1. Introduction



1.1 The Curie-Weiss-Potts model. The Curie-Weiss-Potts model is a mean-field approximation
of the Potts model, a well-known model in equilibrium statistical mechanics. It is defined in
terms of a mean interaction averaged over all sites in the model; more precisely, by sequences
of probability measures of $n$ spin random variables that may occupy one of $q$ different
states. For $q = 2$ the model reduces to the simpler Curie-Weiss model. Two ways in which the
Curie-Weiss-Potts model approximates the Potts model are discussed in [18] and [17].
Probability limit theorems for the Curie-Weiss-Potts model were first proved in [13]. One
reason for the interest in this model is that it explicitly exhibits a number of properties of
real substances, such as multiple phase transitions and metastable states. Compared with the
Curie-Weiss model it has a more intricate phase transition structure: for example, at the
critical inverse temperature it undergoes a first-order phase transition rather than a
second-order one. To carry out the analysis of the model, detailed information about the
structure of the set of canonical equilibrium macro-states is required.
The probability of observing a configuration $\sigma \in \{1,\dots,q\}^n$ in an exterior field $h$ equals

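$$P_{\beta,h,n}(\sigma) = \frac{1}{Z_{\beta,h,n}} \exp\Big( \frac{\beta}{n} \sum_{1 \le i < j \le n} \delta_{\sigma_i,\sigma_j} + h \sum_{i=1}^{n} \delta_{\sigma_i,1} \Big), \tag{1.1}$$

where $\delta$ is the Kronecker symbol, $\beta := T^{-1}$ is the inverse temperature and $Z_{\beta,h,n}$ is the
normalization constant known as the partition function. More precisely:

$$Z_{\beta,h,n} = \sum_{\sigma \in \{1,\dots,q\}^n} \exp\Big( \frac{\beta}{n} \sum_{1 \le i < j \le n} \delta_{\sigma_i,\sigma_j} + h \sum_{i=1}^{n} \delta_{\sigma_i,1} \Big).$$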

For $\beta$ small, the spin random variables are weakly dependent, while for $\beta$ large they are
strongly dependent. It was shown in [27] that at $h = 0$ the model undergoes a phase transition
at the critical inverse temperature

$$\beta_c = \begin{cases} q, & \text{if } q \le 2, \\ 2\,\dfrac{q-1}{q-2}\,\log(q-1), & \text{if } q > 2, \end{cases} \tag{1.2}$$

and that this transition is first order if $q > 2$. Our interest is in the limit distribution of the
empirical vector of the spin variables

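$$N = (N_1,\dots,N_q) := \Big( \sum_{i=1}^{n} \delta_{\sigma_i,1}, \dots, \sum_{i=1}^{n} \delta_{\sigma_i,q} \Big), \tag{1.3}$$

which counts the number of spins of each color for a configuration $\sigma$. Note that the normalized
empirical vector $L_n := N/n$ belongs to the set of probability vectors

$$\mathcal{H} = \{ x \in \mathbb{R}^q : x_1 + \dots + x_q = 1 \text{ and } x_i \ge 0 \ \forall i \}.$$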
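For very small systems the objects just introduced can be computed by brute force. The following is a minimal sketch, not part of the original analysis: it assumes the pair interaction is normalized as $(\beta/n)\sum_{i<j}\delta_{\sigma_i,\sigma_j}$, and the function names (`gibbs_weight`, `partition_function`, `empirical_vector`, `law_of_empirical_vector`) are illustrative choices.

```python
import itertools
import math


def gibbs_weight(sigma, beta, h):
    """Unnormalised Gibbs weight of a configuration sigma with entries in 1..q.

    Assumption: the interaction is (beta / n) * #{i < j : sigma_i == sigma_j}.
    """
    n = len(sigma)
    # number of pairs i < j carrying the same colour
    equal_pairs = sum(
        1 for i in range(n) for j in range(i + 1, n) if sigma[i] == sigma[j]
    )
    # number of spins aligned with the external field (colour 1)
    aligned = sum(1 for s in sigma if s == 1)
    return math.exp(beta * equal_pairs / n + h * aligned)


def configurations(n, q):
    """All q**n configurations in {1, ..., q}^n."""
    return itertools.product(range(1, q + 1), repeat=n)


def partition_function(n, q, beta, h):
    """Normalisation constant: sum of the weights over all configurations."""
    return sum(gibbs_weight(s, beta, h) for s in configurations(n, q))


def empirical_vector(sigma, q):
    """N = (N_1, ..., N_q): number of spins of each colour."""
    return tuple(sum(1 for s in sigma if s == k) for k in range(1, q + 1))


def law_of_empirical_vector(n, q, beta, h):
    """Exact distribution of N under the Gibbs measure, by enumeration."""
    z = partition_function(n, q, beta, h)
    law = {}
    for s in configurations(n, q):
        key = empirical_vector(s, q)
        law[key] = law.get(key, 0.0) + gibbs_weight(s, beta, h) / z
    return law
```

The enumeration costs $q^n$ evaluations, so it is only meant as a sanity check for small $n$; with $h = 0$ the resulting law is invariant under permutations of the colors, while $h > 0$ favors color 1.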
For $q > 2$ and $\beta < \beta_c$, $L_n$ satisfies the law of large numbers
$P_{\beta,0,n}(L_n \in d\nu) \Rightarrow \delta_{\nu_0}(d\nu)$ as $n \to \infty$, where $\nu_0 = (1/q,\dots,1/q) \in \mathbb{R}^q$.
For $\beta > \beta_c$ the law of large numbers breaks down and is replaced by the limit

$$P_{\beta,0,n}(L_n \in d\nu) \Rightarrow \frac{1}{q} \sum_{i=1}^{q} \delta_{\nu_i(\beta)}(d\nu),$$

where $\{\nu_i(\beta), i = 1,\dots,q\}$ are $q$ distinct probability vectors in $\mathbb{R}^q$, distinct from $\nu_0$.
The first-order phase transition is the fact that for $i = 1,\dots,q$

$$\lim_{\beta \to \beta_c^+} \nu_i(\beta) \ne \nu_0,$$

see [13]. The case of non-zero external field $h \ne 0$ was considered in [2] and it turned out that
the first-order transition remains on a critical line. The line was computed explicitly in [3], see (1.4).

In the present work we obtain certain known probabilistic limit theorems for the
Curie-Weiss-Potts model, especially for the empirical vector $N$ of the spin variables, and at the
same time we present rates of convergence for all these limit theorems. We consider the
fluctuations of the empirical vector $N$ around its typical value outside the critical line, and we
describe the fluctuations and rates of convergence at an extremity of the critical line. This
extends previous results on the Curie-Weiss-Potts model with no external field [13] as well as
with external field [15]. The method of proof is an application of Stein's method of so-called
exchangeable pairs in the case of multivariate normal approximation, as well as an application
of Stein's method in the case of non-Gaussian approximation. Stein's method will be explained later.

We turn to the description of the set of canonical equilibrium macro-states of the
Curie-Weiss-Potts model. These states are solutions of an unconstrained minimization problem
involving probability vectors on $\mathbb{R}^q$. The macro-states describe equilibrium configurations of
the model in the thermodynamic limit $n \to \infty$. For each $i$ the $i$'th component of an equilibrium
macro-state gives the asymptotic relative frequency of spins taking the spin value $i$ with
$i \in \{1,\dots,q\}$. We appeal to the theory of large deviations to define the canonical equilibrium
macro-states. Sanov's theorem states that with respect to the product measures
$P_n(\omega) = 1/q^n$ for $\omega \in \{1,\dots,q\}^n$ the empirical vectors $L_n$ satisfy a large deviations
principle (LDP) on $\mathcal{H}$ with speed $n$ and rate function given by the relative entropy

$$I(x) = \sum_{i=1}^{q} x_i \log(q x_i), \quad x \in \mathcal{H}.$$

We use the formal notation $P_n(L_n \in dx) \approx \exp(-n I(x))$ (precise definition: [9]). The LDP for
$L_n$ with respect to $P_{\beta,h,n}$ can be proved as in [11]. Let

$$f_{\beta,h}(x) = \sum_{i=1}^{q} x_i \log(q x_i) - \frac{\beta}{2} \sum_{i=1}^{q} x_i^2 - h x_1, \quad x \in \mathcal{H}.$$

Then $P_{\beta,h,n}(L_n \in dx) \approx \exp(-n J_{\beta,h}(x))$ with

$$J_{\beta,h}(x) := f_{\beta,h}(x) - \inf_{y \in \mathcal{H}} f_{\beta,h}(y),$$

see also [8]. Now if $J_{\beta,h}(\nu) > 0$, then $\nu$ has an exponentially small probability of being observed.
Hence the corresponding set of canonical equilibrium macro-states is naturally defined by

$$\mathcal{E}_{\beta,h} := \{ \nu \in \mathcal{H} : \nu \text{ minimizes } f_{\beta,h} \}.$$
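To get a feeling for the minimization problem defining the equilibrium macro-states, one can evaluate $f_{\beta,h}(x) = \sum_i x_i \log(q x_i) - \frac{\beta}{2}\sum_i x_i^2 - h x_1$ on a finite grid of probability vectors. The sketch below is only an illustration under stated assumptions: the grid search, the helper names `f`, `simplex_grid`, `grid_minimiser` and the grid size are choices made here, and a grid search gives an approximation, not a proof.

```python
import math


def f(x, beta, h):
    """f_{beta,h}(x) on the simplex, with the convention 0 * log 0 = 0."""
    q = len(x)
    entropy = sum(xi * math.log(q * xi) for xi in x if xi > 0)
    return entropy - 0.5 * beta * sum(xi * xi for xi in x) - h * x[0]


def simplex_grid(q, steps):
    """All probability vectors in R^q whose entries are multiples of 1/steps."""

    def compositions(remaining, parts):
        if parts == 1:
            yield (remaining,)
            return
        for k in range(remaining + 1):
            for rest in compositions(remaining - k, parts - 1):
                yield (k,) + rest

    for counts in compositions(steps, q):
        yield tuple(c / steps for c in counts)


def grid_minimiser(q, beta, h, steps=30):
    """Grid point with the smallest value of f_{beta,h}."""
    return min(simplex_grid(q, steps), key=lambda x: f(x, beta, h))
```

For $q = 3$ and a small inverse temperature such as $\beta = 1$ with $h = 0$, the grid minimizer is the uniform vector $(1/3, 1/3, 1/3)$; switching on a field $h > 0$ moves mass towards the first coordinate.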

Note that the specific Gibbs free energy for the Curie-Weiss-Potts model is the quantity
$\psi(\beta,h)$ defined by the limit

$$-\beta \psi(\beta,h) = \lim_{n \to \infty} \frac{1}{n} \log Z_{\beta,h,n}.$$

From the large deviations result it follows that

$$-\beta \psi(\beta,h) = - \inf_{x \in \mathcal{H}} f_{\beta,h}(x).$$

In the case $h = 0$ and $q > 2$, it has been known since [27] (for detailed proofs see [8, Theorem 3.1])
that $\mathcal{E}_{\beta,0}$ consists of one element for any $0 < \beta < \beta_c$, where $\beta_c$ is the critical inverse
temperature given in (1.2). For any $\beta > \beta_c$, the set consists of $q$ elements, and at $\beta_c$ it consists
of $q + 1$ elements. In the case with an external field $h \ge 0$ the global minimizers of $f_{\beta,h}$ can be
described as follows. In [3] the following critical line was computed:

$$h_T := \Big\{ (\beta,h) : 0 \le h < h_0 \text{ and } h = \log(q-1) - \beta\,\frac{q-2}{2(q-1)} \Big\} \tag{1.4}$$

with extremities $(\beta_c, 0)$ and $(\beta_0, h_0)$, where

$$\beta_0 = 4\,\frac{q-1}{q} \quad \text{and} \quad h_0 = \log(q-1) - 2\,\frac{q-2}{q}$$

($(\beta_0, h_0)$ were already determined in [2]). Now consider the parametrization

$$x_z := \Big( \frac{1+z}{2}, \frac{1-z}{2(q-1)}, \dots, \frac{1-z}{2(q-1)} \Big), \quad z \in [-1, 1].$$

Depending on the parameters $(\beta, h)$ the function $f_{\beta,h}$ has one or several global minimizers.
The following statement summarizes the results of [27], [8] in the case $h = 0$ and of [3] for
$h > 0$.

Theorem 1.1. Let $\beta, h \ge 0$.

(1) If $h > 0$ and $(\beta,h) \notin h_T$, the function $f_{\beta,h}$ has a unique global minimum point in $\mathcal{H}$.
This minimizer is analytic in $\beta$ and $h$ outside of $h_T \cup \{(\beta_0, h_0)\}$.

(2) If $h > 0$ and $(\beta,h) \in h_T$, the function $f_{\beta,h}$ has two global minimum points in $\mathcal{H}$. More
precisely, for any $z \in (0, (q-2)/q)$, the two global minimum points of $f_{\beta_z,h_z}$ at

$$\beta_z = 2\,\frac{q-1}{zq}\,\log\Big(\frac{1+z}{1-z}\Big) \quad \text{and} \quad h_z = \log(q-1) - \frac{q-2}{2(q-1)}\,\beta_z$$

are the points $x_{\pm z}$.

(3) If $h = 0$ and $\beta < \beta_c$, the unique global minimum point of $f_{\beta,0}$ is $(1/q,\dots,1/q)$.

(4) If $h = 0$ and $\beta > \beta_c$, there are $q$ global minimum points of $f_{\beta,0}$, which all equal $x_z$ up to
a permutation of the coordinates for some $z \in ((q-2)/q, 1)$.

(5) If $h = 0$ and $\beta = \beta_c$, there are $q + 1$ global minimum points of $f_{\beta,0}$: the symmetric one
$(1/q,\dots,1/q)$ together with the permutations of

$$\Big( \frac{q-1}{q}, \frac{1}{q(q-1)}, \dots, \frac{1}{q(q-1)} \Big).$$
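Part (2) of the theorem can be probed numerically. The sketch below evaluates $f_{\beta,h}(x) = \sum_i x_i \log(q x_i) - \frac{\beta}{2}\sum_i x_i^2 - h x_1$ at the two candidate points $x_{\pm z}$, with $\beta_z$ and $h_z$ taken from the formulas in the statement, and confirms that the two values coincide. It checks equality of the two values only, not global minimality, and the helper names are illustrative.

```python
import math


def x_z(z, q):
    """Point ((1+z)/2, (1-z)/(2(q-1)), ..., (1-z)/(2(q-1))) of the simplex."""
    return ((1 + z) / 2,) + ((1 - z) / (2 * (q - 1)),) * (q - 1)


def beta_z(z, q):
    """Inverse temperature on the critical line, parametrised by z."""
    return 2 * (q - 1) / (z * q) * math.log((1 + z) / (1 - z))


def h_z(z, q):
    """External field on the critical line, parametrised by z."""
    return math.log(q - 1) - (q - 2) / (2 * (q - 1)) * beta_z(z, q)


def beta_c(q):
    """Critical inverse temperature at h = 0 for q > 2."""
    return 2 * (q - 1) / (q - 2) * math.log(q - 1)


def f(x, beta, h):
    """f_{beta,h}(x) with the convention 0 * log 0 = 0."""
    q = len(x)
    entropy = sum(xi * math.log(q * xi) for xi in x if xi > 0)
    return entropy - 0.5 * beta * sum(xi * xi for xi in x) - h * x[0]


def value_gap(z, q):
    """|f(x_z) - f(x_{-z})| at the parameters (beta_z, h_z)."""
    b, h = beta_z(z, q), h_z(z, q)
    return abs(f(x_z(z, q), b, h) - f(x_z(-z, q), b, h))
```

As $z$ runs from $0$ to $(q-2)/q$, the pair $(\beta_z, h_z)$ moves along the critical line from the extremity $(\beta_0, h_0)$ to $(\beta_c, 0)$, which the limits of `beta_z` reflect.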

Interestingly, the very first results on probabilistic limit theorems ([12] for the Curie-Weiss
model and [13] for the Curie-Weiss-Potts model) used the structure of the global minimum
points of another function $G_{\beta,h}$. For the Curie-Weiss-Potts model with $h = 0$ this function is
given by

$$G_{\beta,0}(u) := \frac{1}{2} \beta \langle u, u \rangle - \log \sum_{i=1}^{q} e^{\beta u_i}, \quad u \in \mathbb{R}^q.$$

With convex duality one obtains the alternative representation of the specific Gibbs free energy:

$$\beta \psi(\beta, 0) = \min_{u \in \mathbb{R}^q} G_{\beta,0}(u) + \log q.$$

Actually $f_{\beta,0}$ and $G_{\beta,0}$ have the same global minimum points, see [18] for $\beta \ne \beta_c$. A proof
of this result for any $\beta > 0$ can be found in [8, Theorem 3.1]. The main reason to use $G_{\beta,0}$
instead of $f_{\beta,0}$ is the usefulness of a representation of the distribution of $L_n$ in terms of $G_{\beta,h}$,
called the Hubbard-Stratonovich transform (see [13, Lemma 3.2] and the proof of Lemma 4.3 in this
paper). This has been a famous tool since the work of Ellis and Newman [12]. For $\beta > 0$ and $h$ real
the global minimum points of $f_{\beta,h}$ coincide with the global minimum points of the function

$$G_{\beta,h}(u) := \frac{1}{2} \beta \langle u, u \rangle - \log \Big( \sum_{i=1}^{q} \exp(\beta u_i + h \delta_{i,1}) \Big), \quad u \in \mathbb{R}^q \tag{1.5}$$

(for a proof see [14, Theorem B.1]; or apply a general result on minimum points of certain
functions related by convex duality, [?, Theorem A.1], see also [26]). Hence we know that all
statements of Theorem 1.1 hold true for $G_{\beta,h}$.

Corollary 1.2. The statements in Theorem 1.1 for the global minimum points of $f_{\beta,h}$ hold true
one to one for $G_{\beta,h}$, defined in (1.5).
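A critical point of $G_{\beta,h}(u) = \frac{1}{2}\beta\langle u,u\rangle - \log\sum_i \exp(\beta u_i + h\delta_{i,1})$ satisfies the fixed-point equation $u_i = \exp(\beta u_i + h\delta_{i,1}) / \sum_k \exp(\beta u_k + h\delta_{k,1})$, because the gradient of $G_{\beta,h}$ is $\beta$ times the difference of the two sides. The following sketch, with illustrative helper names, evaluates $G_{\beta,h}$ and its gradient so that candidate minimizers can be checked numerically; it verifies stationarity only.

```python
import math


def tilted_softmax(u, beta, h):
    """Vector with entries exp(beta*u_i + h*[i == 1]) / normalisation."""
    weights = [
        math.exp(beta * ui + (h if i == 0 else 0.0)) for i, ui in enumerate(u)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def G(u, beta, h):
    """G_{beta,h}(u) = (beta/2) <u, u> - log sum_i exp(beta*u_i + h*[i == 1])."""
    log_sum = math.log(
        sum(math.exp(beta * ui + (h if i == 0 else 0.0)) for i, ui in enumerate(u))
    )
    return 0.5 * beta * sum(ui * ui for ui in u) - log_sum


def grad_G(u, beta, h):
    """Gradient of G_{beta,h}: beta * (u - tilted_softmax(u))."""
    p = tilted_softmax(u, beta, h)
    return [beta * (ui - pi) for ui, pi in zip(u, p)]
```

At $h = 0$ the uniform vector is always a critical point, and on the critical line the points $x_{\pm z}$ of Theorem 1.1 (2) are critical points as well.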

The detour, first describing the canonical equilibrium macro-states of the Curie-Weiss-Potts
model using large deviation theory and then using convex duality, has the following reason.
Applying Stein's method we will automatically meet the function $G_{\beta,h}$, and the limit theorems
and the proofs of certain rates of convergence will depend on the location of the global minimum
points of $G_{\beta,h}$ (as in [12], [13], [14] and many other papers). But for $h > 0$ only $f_{\beta,h}$ and
its minimizers were completely characterized in the literature, see Theorem 1.1. So we had to
argue that we also know the phase diagram of $G_{\beta,h}$.
1.2 Stein's method of exchangeable pairs. Starting with a bound for the distance between
univariate random variables and the normal distribution, Stein's method was first published in
[25] (1972). Being particularly powerful in the presence of both local dependence and weak
global dependence, the method proved to be very successful. In [23] Stein introduced his
exchangeable pair approach. At the heart of the method is a coupling of a random variable $W$
with another random variable $W'$ such that $(W, W')$ is exchangeable, i.e. their joint
distribution is symmetric. Central in his approach is the fact that for all antisymmetric measurable
functions $F(x, y)$ we have $E[F(W, W')] = 0$ if the expectation exists. Stein further proved
that a measure of proximity of $W$ to normality may be provided by the exchangeable pair if
$W' - W$ is sufficiently small. He assumed that there is a number $\lambda > 0$ such that
the conditional expectation of $W' - W$ given $W$ satisfies

$$E[W' - W \mid W] = -\lambda W.$$

Heuristically, this condition can be understood as a linear regression condition: if $(W, W')$ were
bivariate normal with correlation $\rho$, then $E(W' \mid W) = \rho W$ and the condition would be satisfied
with $\lambda = 1 - \rho$. Stein proved that for any uniformly Lipschitz function $h$,
$|E h(W) - E h(Z)| \le \delta \|h'\|$ with $Z$ denoting a standard normally distributed random variable and

$$\delta := 4 E \Big| 1 - \frac{1}{2\lambda} E\big( (W' - W)^2 \mid W \big) \Big| + \frac{1}{2\lambda} E|W - W'|^3. \tag{1.6}$$

Stein's approach has been successfully applied in many models, see e.g. [23] or [25] and
references therein. In [21], the range of application was extended by replacing the linear regression
property by a weaker condition, assuming that there is also a random variable $R = R(W)$ such
that

$$E[W' - W \mid W] = -\lambda W + R.$$

While the approach has proved successful also in non-normal contexts (see [3], [6] and [10]), it
remained restricted to the one-dimensional setting for a long time. The problem was that it
was not obvious how to transfer the linearity condition to the multivariate case. In [5] this
issue was finally addressed. The authors extended the linearity condition to the multivariate
setting such that, for all $i \in \{1,\dots,d\}$, $E[W_i' - W_i \mid W] = -\lambda W_i$ for a fixed number $\lambda$, where
now $W = (W_1,\dots,W_d)$ and $W' = (W_1',\dots,W_d')$ are identically distributed $d$-vectors with
uncorrelated components. As in the univariate case, an extension to the additional remainder
term $R$ would be straightforward. This coupling serves to estimate the distance to the
standard multivariate normal distribution. Applying the linear regression heuristic in the
multivariate case leads to a new condition due to [19]:

$$E[W' - W \mid W] = -\Lambda W + R \tag{1.7}$$

for an invertible $d \times d$ matrix $\Lambda$ and a remainder term $R = R(W)$. This linearity condition is
more natural than the one of [5]. Different exchangeable pairs will, of course, yield different $\Lambda$
and $R$.
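The linear regression condition is easiest to see in the classical toy case, which is not taken from this paper: $W = n^{-1/2}\sum_i X_i$ with independent uniform signs $X_i \in \{-1, +1\}$, and $W'$ obtained by replacing one uniformly chosen coordinate by an independent copy. The sketch below verifies by exhaustive enumeration that the conditional drift equals $-W/n$, so the condition holds with $\lambda = 1/n$ and $R = 0$. The function name is an illustrative choice.

```python
import itertools
import math


def regression_defect(n):
    """Largest value of |E[W' - W | X] + W / n| over all sign configurations.

    W  = n^{-1/2} * sum(X_i) with X_i in {-1, +1};
    W' = W with one uniformly chosen coordinate resampled uniformly.
    Conditioning on the whole configuration X is finer than conditioning
    on W, so a zero defect here implies E[W' - W | W] = -W / n.
    """
    worst = 0.0
    for x in itertools.product((-1, 1), repeat=n):
        w = sum(x) / math.sqrt(n)
        drift = 0.0
        for i in range(n):            # index I, probability 1/n each
            for new in (-1, 1):       # resampled sign, probability 1/2 each
                drift += (new - x[i]) / math.sqrt(n) / (2 * n)
        worst = max(worst, abs(drift + w / n))
    return worst
```

In the Curie-Weiss-Potts model the same resampling idea is used, but the replacement spin is drawn from a conditional Gibbs distribution, which is what produces a matrix $\Lambda$ and a nonzero remainder $R$.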
Interestingly, the Curie-Weiss-Potts model is an example that demonstrates the power
of the approach in [19]. Constructing an exchangeable pair in the Curie-Weiss-Potts model to
obtain an approximate linear regression property (1.7) leads us to the function $G_{\beta,h}$. This will
be sketched now. Let $q > 2$, $h = 0$ and $\beta < \beta_c$, and let $x_0$ denote the unique global minimum
point of $G_{\beta,0}$, see Theorem 1.1. We consider

$$W := \sqrt{n} \Big( \frac{N}{n} - x_0 \Big) = \sqrt{n} (L_n - x_0).$$

We produce a spin collection $\sigma' = (\sigma_i')_{i \ge 1}$ via a Gibbs sampling procedure: let $I$ be uniformly
distributed over $\{1,\dots,n\}$ and independent of all other random variables involved. We
now replace the spin $\sigma_I$ by $\sigma_I'$ drawn from the conditional distribution of the $I$'th coordinate
given $(\sigma_j)_{j \ne I}$, independently of $\sigma_I$. We define

$$Y_I := (\delta_{\sigma_I,1},\dots,\delta_{\sigma_I,q})^t \quad \text{and} \quad Y_I' := (\delta_{\sigma_I',1},\dots,\delta_{\sigma_I',q})^t$$

and consider

$$W' := W - \frac{Y_I}{\sqrt{n}} + \frac{Y_I'}{\sqrt{n}}. \tag{1.8}$$

It is not hard to see that $(W, W')$ is an exchangeable pair. This construction will also
be used in all the proofs in this paper. Let $\mathcal{F} := \sigma(\sigma_1,\dots,\sigma_n)$. We obtain

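$$E[W_i' - W_i \mid \mathcal{F}] = \frac{1}{\sqrt{n}} E[Y_{I,i}' - Y_{I,i} \mid \mathcal{F}]
= -\frac{1}{\sqrt{n}} \frac{1}{n} \sum_{j=1}^{n} \delta_{\sigma_j,i} + \frac{1}{\sqrt{n}} \frac{1}{n} \sum_{j=1}^{n} E[\delta_{\sigma_j',i} \mid \mathcal{F}]
= -\frac{1}{n} W_i - \frac{x_{0,i}}{\sqrt{n}} + \frac{1}{\sqrt{n}} \frac{1}{n} \sum_{j=1}^{n} E[\delta_{\sigma_j',i} \mid \mathcal{F}].$$

Using our construction we obtain

$$E[\delta_{\sigma_j',i} \mid \mathcal{F}] = P_{\beta,0,n}\big( \sigma_j = i \mid (\sigma_t)_{t \ne j} \big).$$

By a straightforward calculation (see Lemma 4.3) we get that

$$P_{\beta,0,n}\big( \sigma_j = i \mid (\sigma_t)_{t \ne j} \big) = \frac{\exp(\beta m_{i,j}(\sigma))}{\sum_{k=1}^{q} \exp(\beta m_{k,j}(\sigma))}$$

with $m_{i,j}(\sigma) := \frac{1}{n} \sum_{l \ne j} \delta_{\sigma_l,i}$. Using the notation $m_i(\sigma) := \frac{1}{n} \sum_{l=1}^{n} \delta_{\sigma_l,i}$ one obtains

$$\frac{1}{\sqrt{n}} \frac{1}{n} \sum_{j=1}^{n} \frac{\exp(\beta m_{i,j}(\sigma))}{\sum_{k=1}^{q} \exp(\beta m_{k,j}(\sigma))} = \frac{1}{\sqrt{n}} \frac{\exp(\beta m_i(\sigma))}{\sum_{k=1}^{q} \exp(\beta m_k(\sigma))} + R_n(i). \tag{1.9}$$

Moreover, by definition of $G_{\beta,h}$ (see Lemma 4.2) we obtain

$$\frac{\exp(\beta u_m)}{\sum_{k=1}^{q} \exp(\beta u_k)} = u_m - \frac{1}{\beta} \frac{\partial}{\partial u_m} G_{\beta,0}(u).$$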
Summarizing we obtain

$$E[W_i' - W_i \mid \mathcal{F}] = -\frac{1}{n} W_i - \frac{x_{0,i}}{\sqrt{n}} + R_n(i) + \frac{1}{\sqrt{n}} \Big( m_i(\sigma) - \frac{1}{\beta} \frac{\partial}{\partial u_i} G_{\beta,0}(m(\sigma)) \Big) \tag{1.10}$$

with $m(\sigma) = (m_1(\sigma),\dots,m_q(\sigma))$. Hence, using all the information on $G_{\beta,h}$ (Taylor expansion and
the results on the global minimum points, Theorem 1.1 and Corollary 1.2), it seems possible
to calculate $\Lambda$ and $R$ in the regression condition (1.7). Indeed we are able to prove that this
condition is satisfied for any $(\beta, h)$.
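The conditional law that drives the Gibbs sampling step can be checked by brute force on a small system. The sketch below assumes the pair interaction is normalized as $(\beta/n)\sum_{i<j}\delta_{\sigma_i,\sigma_j}$ and allows an external field $h$ on color 1; it compares the conditional distribution obtained by enumerating the $q$ possible values of one spin with the closed form $\exp(\beta m_{i,j}(\sigma) + h\delta_{i,1})$, suitably normalized. All function names are illustrative, and `gibbs_step` is one hypothetical way to produce the modified configuration $\sigma'$.

```python
import itertools
import math
import random


def gibbs_weight(sigma, beta, h):
    """Unnormalised weight; assumes interaction (beta/n) * #{i<j: equal colours}."""
    n = len(sigma)
    equal_pairs = sum(
        1 for i in range(n) for j in range(i + 1, n) if sigma[i] == sigma[j]
    )
    aligned = sum(1 for s in sigma if s == 1)
    return math.exp(beta * equal_pairs / n + h * aligned)


def conditional_by_enumeration(sigma, j, q, beta, h):
    """Law of spin j given the others, from the ratio of Gibbs weights."""
    weights = []
    for colour in range(1, q + 1):
        changed = list(sigma)
        changed[j] = colour
        weights.append(gibbs_weight(changed, beta, h))
    total = sum(weights)
    return [w / total for w in weights]


def conditional_by_formula(sigma, j, q, beta, h):
    """Closed form: proportional to exp(beta * m_{i,j}(sigma) + h * [i == 1])."""
    n = len(sigma)
    m = [
        sum(1 for l, s in enumerate(sigma) if l != j and s == colour) / n
        for colour in range(1, q + 1)
    ]
    weights = [math.exp(beta * m[c] + (h if c == 0 else 0.0)) for c in range(q)]
    total = sum(weights)
    return [w / total for w in weights]


def gibbs_step(sigma, q, beta, h, rng):
    """Resample one uniformly chosen spin from its conditional law."""
    j = rng.randrange(len(sigma))
    probs = conditional_by_formula(sigma, j, q, beta, h)
    new_colour = rng.choices(range(1, q + 1), weights=probs)[0]
    updated = list(sigma)
    updated[j] = new_colour
    return tuple(updated)
```

A call such as `gibbs_step(sigma, q, beta, h, random.Random(0))` returns a configuration that differs from `sigma` in at most one coordinate, which is why $|W' - W|$ is of order $n^{-1/2}$.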

In Section 2 of the present paper, the limit theorems and the rates of convergence are stated.
They include a central limit theorem and a bound on the distance to a multivariate normal
distribution for $L_n$ outside the critical line. When $G_{\beta,h}$ has several global minimizers, that is
when $(\beta,h) \in h_T$ or $\beta \ge \beta_c$ and $h = 0$, the empirical vector $L_n$ is close to one or the
other of the minimizers. In this case we prove a conditional central limit theorem
for $L_n$. Next we describe the fluctuations at the extremity $(\beta_0, h_0)$ of the critical line, again
combined with a rate of convergence. In Section 3 we state an abstract nonsingular multivariate
normal approximation theorem for smooth test functions from [19]. Moreover we present a
Kolmogorov bound (nonsmooth test functions) for bounded random vectors $W' - W$ under
exchangeability. Finally we state an abstract non-Gaussian univariate approximation theorem
for the Kolmogorov distance from [10]. Section 4 contains some auxiliary results which will be
necessary for the proofs given in Section 5.

2. Statement of results 

Let us fix some notation. From now on we will write random vectors in $\mathbb{R}^d$ in the form
$w = (w_1,\dots,w_d)^t$, where the $w_i$ are $\mathbb{R}$-valued variables for $i = 1,\dots,d$. If a matrix $\Sigma$ is symmetric
and nonnegative definite, we denote by $\Sigma^{1/2}$ the unique symmetric, nonnegative definite square root
of $\Sigma$. Id denotes the identity matrix and from now on $Z$ will denote a random vector having
standard multivariate normal distribution. The expectation with respect to the measure $P_{\beta,h,n}$
will be denoted by $E := E_{P_{\beta,h,n}}$.

Let $q > 2$. We first consider the fluctuations of the empirical vector $N$ defined
in (1.3) around its typical value. The case of the Curie-Weiss model ($q = 2$) was considered
in [22] and [12]. A Berry-Esseen bound was proved in [10] (and independently in [6]). The
Curie-Weiss-Potts model was treated in [13] and for non-zero external field in [15, Theorem
3.1]. To the best of our knowledge, no Kolmogorov bounds are known.

Theorem 2.1. Let $\beta > 0$ and $h \ge 0$ with $(\beta,h) \ne (\beta_0,h_0)$. Assume that there is a unique
minimizer $x_0$ of $G_{\beta,h}$. Let $W$ be the following random variable:

$$W := \sqrt{n} \Big( \frac{N}{n} - x_0 \Big).$$

If $Z$ has the $q$-dimensional standard normal distribution, we have for every three times
differentiable function $g$,

$$|E g(W) - E g(\Sigma^{1/2} Z)| \le C \cdot n^{-1/2},$$

for a constant $C$ and $\Sigma := E[W W^t]$. Indeed we obtain that $C = O(q^6)$.

Note that we compare the distribution of the rescaled vector $N$ with a multivariate normal
distribution with covariance matrix $E[W W^t]$. This is an advantage of Stein's method: for
any fixed number of particles/spins $n$, we are able to compare the distribution of $W$ with a
distribution with the same $n$-dependent covariance structure.

In order to state our next result we introduce conditions on the function classes $\mathcal{G}$ we consider.
Following [20], let $\Phi$ denote the standard normal distribution in $\mathbb{R}^q$. We define for $g : \mathbb{R}^q \to \mathbb{R}$

$$g_\delta^+(x) = \sup\{ g(x + y) : |y| \le \delta \}, \tag{2.1}$$
$$g_\delta^-(x) = \inf\{ g(x + y) : |y| \le \delta \}, \tag{2.2}$$
$$\tilde{g}(x, \delta) = g_\delta^+(x) - g_\delta^-(x). \tag{2.3}$$

Let $\mathcal{G}$ be a class of real measurable functions on $\mathbb{R}^q$ such that

(1) The functions $g \in \mathcal{G}$ are uniformly bounded in absolute value by a constant, which we
take to be 1 without loss of generality.

(2) For any $q \times q$ matrix $A$ and any vector $b \in \mathbb{R}^q$, $g(Ax + b) \in \mathcal{G}$.

(3) For any $\delta > 0$ and any $g \in \mathcal{G}$, $g_\delta^+(x)$ and $g_\delta^-(x)$ are in $\mathcal{G}$.

(4) For some constant $a = a(\mathcal{G}, q)$, $\sup_{g \in \mathcal{G}} \big\{ \int_{\mathbb{R}^q} \tilde{g}(x, \delta) \, \Phi(dx) \big\} \le a \delta$. Obviously we may assume
$a \ge 1$.

In the one-dimensional case, the collection of indicators of all half lines
and the collection of indicators of all intervals form classes $\mathcal{G}$ that satisfy these conditions with
$a = \sqrt{2/\pi}$ and $a = 2\sqrt{2/\pi}$ respectively. This was shown for example in [20]. In dimension $q \ge 1$ the class
of indicators of convex sets is known to be such a class. Using this notation we are able to
present a counterpart of Theorem 2.1 for our function classes $\mathcal{G}$.
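Condition (4) can be made concrete for indicators of half lines in dimension one. For $g = 1_{(-\infty, z]}$ the upper envelope $g_\delta^+$ is the indicator of $(-\infty, z + \delta]$ and the lower envelope $g_\delta^-$ is the indicator of $(-\infty, z - \delta]$, so their difference is the indicator of $(z - \delta, z + \delta]$, whose standard normal mass is at most $\sqrt{2/\pi}\,\delta$. The sketch below checks this numerically; the function names are illustrative.

```python
import math


def std_normal_cdf(x):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def upper_envelope(x, z, delta):
    """g_delta^+(x) for g = indicator of (-inf, z]: sup over |y| <= delta."""
    return 1.0 if x - delta <= z else 0.0


def lower_envelope(x, z, delta):
    """g_delta^-(x) for g = indicator of (-inf, z]: inf over |y| <= delta."""
    return 1.0 if x + delta <= z else 0.0


def oscillation_mass(z, delta):
    """Integral of g_delta^+ - g_delta^- against the standard normal law.

    The integrand is the indicator of (z - delta, z + delta].
    """
    return std_normal_cdf(z + delta) - std_normal_cdf(z - delta)
```

The bound is attained to first order at $z = 0$, where the normal density is largest, which explains the value of the constant for half lines.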

Theorem 2.2. Let $\beta > 0$ and $h \ge 0$ with $(\beta,h) \ne (\beta_0,h_0)$. Assume that there is a unique
minimizer $x_0$ of $G_{\beta,h}$. Let $W$ and $Z$ be as in Theorem 2.1. Then, for all $g \in \mathcal{G}$ with $|g| \le 1$, we have

$$|E g(W) - E g(\Sigma^{1/2} Z)| \le C \log(n) \cdot n^{-1/2},$$

for a constant $C$ and $\Sigma := E[W W^t]$.

Letting $\mathcal{G}$ be the collection of indicators of lower quadrants, the distance above specializes to
the Kolmogorov distance.

When the function $G_{\beta,h}$ has several global minimizers, the empirical vector $N/n$ is close to
one or the other of these minima. We determine the conditional fluctuations and a rate
of convergence:

Theorem 2.3. Assume that $\beta, h \ge 0$ and that $G_{\beta,h}$ has multiple global minimum points
$x_1,\dots,x_l$ with $l \in \{2, q, q+1\}$, and let $\epsilon > 0$ be smaller than the distance between any two global
minimizers of $G_{\beta,h}$. Furthermore, let $W^{(i)} := \sqrt{n} \big( \frac{N}{n} - x_i \big)$. Then, if $Z$ has the $q$-dimensional
standard normal distribution, under the conditional measure

$$P_{\beta,h,n}\Big( \,\cdot\, \Big| \, \frac{N}{n} \in B(x_i, \epsilon) \Big),$$

we have for every three times differentiable function $g$,

$$|E g(W^{(i)}) - E g(\Sigma^{1/2} Z)| \le C \cdot n^{-1/2},$$

for a constant $C$ and $\Sigma := E[W^{(i)} (W^{(i)})^t]$. $B(x_i, \epsilon)$ denotes the open ball of radius $\epsilon$ around
$x_i$.

We note that we get a result similar to Theorem 2.2 for the function classes $\mathcal{G}$ in the case
of several global minimizers. Finally we take a look at the extremity $(\beta_0, h_0)$ of the critical
line $h_T$. Given a vector $u \in \mathbb{R}^q$, we denote by $u^\perp$ the vector space of all vectors orthogonal
to $u$ in the Euclidean space $\mathbb{R}^q$. Consider the hyperplane

$$\mathcal{M} := \Big\{ x \in \mathbb{R}^q : \sum_{i=1}^{q} x_i = 0 \Big\}, \tag{2.4}$$

which is parallel to $\mathcal{H}$. The fluctuations belong to $\mathcal{M}$, since all global minimizers are in $\mathcal{H}$.
The following result extends [12, Theorem 3.9], which applies to the case of the Curie-Weiss
model at the critical inverse temperature. Recall that at $(\beta_0, h_0)$ the function
$G_{\beta_0,h_0}$ has the unique minimizer $x = (1/2, 1/(2(q-1)),\dots,1/(2(q-1))) \in \mathbb{R}^q$. Now we take
$u = (1-q, 1,\dots,1) \in \mathcal{M} \subset \mathbb{R}^q$ and define a real valued random variable $T$ and a random
vector $V \in \mathcal{M} \cap u^\perp$ such that

$$N = n x + n^{3/4} T u + n^{1/2} V. \tag{2.5}$$
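The implicit definition (2.5) amounts to an orthogonal projection of $N - nx$ onto the direction $u = (1-q, 1, \dots, 1)$, with $x = (1/2, 1/(2(q-1)), \dots, 1/(2(q-1)))$. A minimal sketch of this decomposition, with an illustrative function name, reads:

```python
def decompose(counts):
    """Return (T, V) with N = n*x + n**0.75 * T * u + n**0.5 * V.

    counts is the empirical vector N = (N_1, ..., N_q); n is its sum.
    T is the rescaled coefficient of N - n*x along u, V the rescaled rest.
    """
    q = len(counts)
    n = sum(counts)
    x = [0.5] + [1.0 / (2 * (q - 1))] * (q - 1)
    u = [1.0 - q] + [1.0] * (q - 1)
    centred = [c - n * xi for c, xi in zip(counts, x)]
    # coefficient of the orthogonal projection of N - n*x onto u
    coeff = sum(d * ui for d, ui in zip(centred, u)) / sum(ui * ui for ui in u)
    t = coeff / n ** 0.75
    v = [(d - coeff * ui) / n ** 0.5 for d, ui in zip(centred, u)]
    return t, v
```

For instance, with $q = 3$ and $N = (60, 15, 25)$ the centred vector is $(10, -10, 0)$, its coefficient along $u = (-2, 1, 1)$ is $-5$, and the remainder $(0, -5, 5)$ is orthogonal to both $u$ and $(1, 1, 1)$.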

Since $N - nx \in \mathcal{M}$, the implicit definition of $T$ and $V$ is a decomposition into a vector in (the
subspace spanned by) $u$ and a vector in $u^\perp$. The main interest is the limiting behaviour of $T$. The new
scaling of $W$ is given by

$$\frac{N_j - n x_j}{n^{3/4}} = T + \frac{V_j}{n^{1/4}}, \quad j = 2,\dots,q,$$

and the limit we observe is reminiscent of [12], see also [?]. The following theorem gives
a Kolmogorov bound for Theorem 3.7 in [15].

Theorem 2.4. For $(\beta,h) = (\beta_0,h_0)$ let $x = (1/2, 1/(2(q-1)),\dots,1/(2(q-1)))$ be the unique
minimizer of $G_{\beta_0,h_0}$ and $u = (1-q, 1,\dots,1)$. Furthermore, let $Z_{q,T}$ be a random variable
distributed according to the probability measure on $\mathbb{R}$ with the density

$$f_{q,T}(t) := f_{q,T,n}(t) := C \cdot \exp\Big( -\frac{t^4}{4 E(T^4)} \Big),$$

where $T$ is defined in (2.5). Then we obtain for any uniformly Lipschitz function $g : \mathbb{R} \to \mathbb{R}$
that

$$|E g(T) - E g(Z_{q,T})| \le C \cdot n^{-1/4}.$$

Moreover we obtain

$$\sup_{t \in \mathbb{R}} |P(T \le t) - F_{q,T}(t)| \le C \cdot n^{-1/4} \quad \text{(bound for the Kolmogorov distance)},$$

where $F_{q,T}$ denotes the distribution function of $f_{q,T}$.

Remark 2.5. As we will see in the proof of Theorem 2.4, the density $f_{q,T}$ originally has the
form

$$\exp\Big( -\frac{4(q-1)^4}{3 E(T \psi(T))} \, t^4 \Big)$$

(up to a constant) with a function $\psi$ such that $E(T \psi(T)) = \frac{16(q-1)^4}{3} E(T^4)$. From [15, Theorem
3.7] we know that $T$ converges in distribution to the probability measure on $\mathbb{R}$ proportional to

$$g_q(t) := \exp\Big( -\frac{4(q-1)^4}{3} \, t^4 \Big).$$

Hence we conclude that $\lim_{n \to \infty} f_{q,T,n} = g_q$ pointwise and therefore $\frac{16(q-1)^4}{3} E[T^4] \to 1$.
Note that the rate of convergence of Theorem 2.4 also holds true when we compare the
distribution of $T$ with the law on $\mathbb{R}$ with density $g_q$.
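A density proportional to $\exp(-t^4/(4m))$ with a parameter $m > 0$ has a useful self-consistency property: its fourth moment is exactly $m$. This follows by applying the Stein identity $E[f'(Z) + \psi(Z) f(Z)] = 0$ with $\psi(t) = -t^3/m$ and $f(t) = t$. The sketch below confirms it numerically with a midpoint rule; `m` is a free parameter standing in for a fourth moment, and the function name, truncation and step count are choices made here.

```python
import math


def quartic_law(m, steps=20000):
    """Normalising constant and fourth moment of p(t) = C * exp(-t^4 / (4 m)).

    Midpoint rule on [-L, L]; L is chosen so that the integrand is below
    exp(-40) at the endpoints, which makes the truncation error negligible.
    """
    half_width = (160.0 * m) ** 0.25
    step = 2.0 * half_width / steps
    mass = 0.0
    fourth = 0.0
    for k in range(steps):
        t = -half_width + (k + 0.5) * step
        weight = math.exp(-t ** 4 / (4.0 * m))
        mass += weight * step
        fourth += t ** 4 * weight * step
    return 1.0 / mass, fourth / mass
```

So a random variable with this density reproduces the parameter $m$ as its own fourth moment, for every $m > 0$.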



Additionally we get a theorem for the random vector $V$, improving Theorem 3.7 in [15].

Theorem 2.6. Let $V$ be defined as in (2.5). For $(\beta,h) = (\beta_0,h_0)$ and $\Sigma := E[V V^t]$ we have
that for every three times differentiable function $g$,

$$|E g(V) - E g(\Sigma^{1/2} Z)| \le C \cdot n^{-1/4},$$

where $C$ is an absolute constant.

Remark 2.7. The proofs of Theorem 2.4 and Theorem 2.6 employ a fourth-order Taylor
expansion of $G_{\beta_0,h_0}$, see (6.11) and (6.12) in the Appendix. The first and the
third term in (6.11), as well as in (6.12), give the order $O(n^{-1/4})$.



3. Stein's method 

Let us fix some more notation. The transpose of the inverse of a matrix will be written
in the form $A^{-t} := (A^{-1})^t$. Furthermore we will need the supremum norm, denoted by $\| \cdot \|$
for both functions and matrices. For derivatives of smooth functions $f : \mathbb{R}^d \to \mathbb{R}$, we use the
notation $\nabla$ for the gradient operator. For a function $f : \mathbb{R}^d \to \mathbb{R}$, we abbreviate

$$|f|_1 := \sup_i \Big\| \frac{\partial}{\partial x_i} f \Big\|, \quad |f|_2 := \sup_{i,j} \Big\| \frac{\partial^2}{\partial x_i \partial x_j} f \Big\|, \tag{3.1}$$

and so on, if these derivatives exist.

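The method of Stein is based on the characterization of the normal distribution that $Y \in \mathbb{R}^d$,
$d \in \mathbb{N}$, is a centered multivariate normal vector with covariance matrix $\Sigma$ if and only if

$$E[\nabla^t \Sigma \nabla f(Y) - Y^t \nabla f(Y)] = 0 \quad \text{for all smooth } f : \mathbb{R}^d \to \mathbb{R}. \tag{3.2}$$

It is well known, see [?] and [16], that for any $g : \mathbb{R}^d \to \mathbb{R}$ being differentiable with bounded
first derivatives, if $\Sigma \in \mathbb{R}^{d \times d}$ is symmetric and positive definite, there is a solution $f : \mathbb{R}^d \to \mathbb{R}$
to the equation

$$\nabla^t \Sigma \nabla f(w) - w^t \nabla f(w) = g(w) - E g(\Sigma^{1/2} Z), \tag{3.3}$$

which holds for every $w \in \mathbb{R}^d$. If, in addition, $g$ is $n$ times differentiable, there is a solution $f$
which is also $n$ times differentiable and one has for every $k = 1,\dots,n$ the bound

$$\Big| \frac{\partial^k f(w)}{\prod_{j=1}^{k} \partial w_{i_j}} \Big| \le \frac{1}{k} \Big\| \frac{\partial^k g}{\prod_{j=1}^{k} \partial w_{i_j}} \Big\| \tag{3.4}$$

for every $w \in \mathbb{R}^d$. We will apply Theorem 2.1 in [19]: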



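Theorem 3.1. (Reinert, Röllin: 2009)

Assume that $(W, W')$ is an exchangeable pair of $\mathbb{R}^d$-valued random vectors such that

$$E[W] = 0, \quad E[W W^t] = \Sigma,$$

with $\Sigma \in \mathbb{R}^{d \times d}$ symmetric and positive definite. If $(W, W')$ satisfies (1.7) for an invertible
matrix $\Lambda$ and a $\sigma(W)$-measurable random vector $R$ and if $Z$ has $d$-dimensional standard normal
distribution, we have for every three times differentiable function $g$,

$$|E g(W) - E g(\Sigma^{1/2} Z)| \le \frac{|g|_2}{4} A + \frac{|g|_3}{12} B + \Big( |g|_1 + \frac{1}{2} d \|\Sigma\|^{1/2} |g|_2 \Big) C, \tag{3.5}$$

where, with $\lambda^{(i)} := \sum_{m=1}^{d} |(\Lambda^{-1})_{m,i}|$,

$$A = \sum_{i,j=1}^{d} \lambda^{(i)} \sqrt{ V\big[ E[(W_i' - W_i)(W_j' - W_j) \mid W] \big] },$$
$$B = \sum_{i,j,k=1}^{d} \lambda^{(i)} E\big| (W_i' - W_i)(W_j' - W_j)(W_k' - W_k) \big|, \tag{3.6}$$
$$C = \sum_{i=1}^{d} \lambda^{(i)} \sqrt{ V[R_i] }.$$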

The advantage of Stein's method is that the bounds on the distance to a multivariate normal
distribution reduce to the computation of, or bounds on, low order moments, here bounds on the absolute
third moments, on a conditional variance and on the variance of the remainder term. Such
variance computations may be difficult, but we will get rates of convergence at the same time.
In the same context as in [19] we can show the following theorem, presenting bounds in
Kolmogorov distance. Our development differs from that of Reinert and Röllin, as we use the relationship
to the bounds in [21]. We obtain a bound of order $\log(n) \cdot n^{-1/2}$ assuming some boundedness,
improving Corollary 3.1 in [19].

Theorem 3.2. Let $(W, W')$ be an exchangeable pair with $E[W W^t] = \Sigma$. Again we assume that
$(W, W')$ satisfies (1.7) for an invertible matrix $\Lambda$ and a $\sigma(W)$-measurable random vector $R$ and
additionally $|W' - W| \le A$. Then,

sup \Eg(W) - Eg(Z)\ < C[\og(n)A 1 + (log(n)HSH 1 / 2 + l) A 2 

+ (1 + log(n) EE|Wi| + a)A 3 A 3 
i=i 



+aA 3 (± + A 3 )} 



where, with \® := ^ |(A 



1N ) ■ 



m=l 



q 



a 2 = j2 x(t W E ^ 

i=i 

As = X>« 

i=i 

$C$ denotes a constant and $a > 1$ is taken from the conditions on $\mathcal{G}$.

Proof. To prove the theorem, we first assume that $\Sigma = \mathrm{Id}$. Throughout the proof we write
$C$ for universal constants, not necessarily the same at each occurrence. First we consider the
multivariate Stein equation deduced from (3.3) with $\Sigma = \mathrm{Id}$, given by

$$\nabla^t \nabla f(w) - w^t \nabla f(w) = g(w) - E[g(Z)]. \tag{3.7}$$

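For $g \in \mathcal{G}$ define the following smoothing

$$g_t(x) = \int_{\mathbb{R}^q} g\big( \sqrt{t}\, z + \sqrt{1-t}\, x \big) \, \Phi(dz).$$

For $g_t$, (3.7) is solved by the function

$$f_t(x) = -\frac{1}{2} \int_t^1 \big( g_s(x) - E[g(Z)] \big) \frac{ds}{1-s},$$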
see [16]. Again by [16], we have that for $|g| \le 1$ there exists a constant $C$, depending only on
the dimension $q$, such that in the notation (3.1)

$$|f_t|_1 \le C, \tag{3.8}$$
$$|f_t|_2 \le C \log(t^{-1}). \tag{3.9}$$

According to [16] (see also [?, Lemma A.1]) there is also a constant $C > 0$, depending on $q$,
such that for all $t \in (0, 1)$:

$$\sup\{ |E g(W) - E g(Z)| : g \in \mathcal{G} \} \le C \cdot \delta_t + a \sqrt{t}, \tag{3.10}$$

where $a > 1$ is the constant of $\mathcal{G}$ and

$$\delta_t := \sup\{ |E g_t(W) - E g_t(Z)| : g \in \mathcal{G} \}.$$

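Thus, it remains to estimate $\delta_t$. Using exchangeability and the linearity condition we have

$$0 = \frac{1}{2} E\big[ (W' - W)^t \Lambda^{-t} ( \nabla f_t(W') + \nabla f_t(W) ) \big]$$
$$= E\big[ (W' - W)^t \Lambda^{-t} \nabla f_t(W) \big] + \frac{1}{2} E\big[ (W' - W)^t \Lambda^{-t} ( \nabla f_t(W') - \nabla f_t(W) ) \big]$$
$$= E\big[ R^t \Lambda^{-t} \nabla f_t(W) \big] - E\big[ W^t \nabla f_t(W) \big] + \frac{1}{2} E\big[ (W' - W)^t \Lambda^{-t} ( \nabla f_t(W') - \nabla f_t(W) ) \big].$$

Abbreviating $f_i^{(1)} := \frac{\partial}{\partial x_i} f_t$ and $f_{i,j}^{(2)} := \frac{\partial^2}{\partial x_i \partial x_j} f_t$, etc., we obtain

$$E\big[ W^t \nabla f_t(W) \big] = \frac{1}{2} E\big[ (W' - W)^t \Lambda^{-t} ( \nabla f_t(W') - \nabla f_t(W) ) \big] + E\big[ R^t \Lambda^{-t} \nabla f_t(W) \big]$$
$$= \frac{1}{2} \sum_{m,i=1}^{q} (\Lambda^{-1})_{m,i} E\big[ (W_i' - W_i) ( f_m^{(1)}(W') - f_m^{(1)}(W) ) \big] + \sum_{m,i=1}^{q} (\Lambda^{-1})_{m,i} E\big[ R_i f_m^{(1)}(W) \big].$$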

Hence, 

E</ t (W)-% t (£) 

= X>[/i?w]-~ E (A-v.E^-^K/fri-zfw)] 

i=l m,i,j=l 

- E (A-V^if (W)] 

= ^4 2 E4?w-2 e (a-'ww^w 

i=l m,i,j=l 

~ E ( A_1 )m,i(^ - - WyO/jjW] 

m,ij=l 
+ e[ E (a l )„,,/Mi; / /;o : (u-) E (A-'W/fW] 

m,i,j=l m,i,j=l 

E (A-i.t^-^K/fn-jjV)) 



m,i,j=l 



=: Ji + J 2 + J 3 . 

Using (EHD and the fact that E[(W - W0(W - W)*] = 2A* - 2K[WR t ] we get 



i=i 



m,i,j=l 



m,i,j'=l 

^Clog^" 1 ) ^ (A- 1 ) mil E|2(A) m , i -2^-E[(^ , -^)(W^ , -^)|W / ] 

m,i j'=l 

9 



^Clog^ 1 ) ^ (A- 1 ) m , J y / E[(2(A) mii -2^-E[(^ / -^)(lUj- W^)|W]) ] 

m,«,j'=l 
9 



= ciog(r 1 ) ^ (A-^i^vlEKw^-w-OCWi-W})!^]. 

m,i,i=l 

Additionally, again using 

(JE3D and flSU) and the fact that E[WW*] = E, we have 



m,i j=l 



m,j j=l 



<Clog(r') E (A- 1 ) m ,E| J R i ^|-C7^(A- 1 ) m ,E| J R J | 

m,i j=l 
9 



^Clog^ 1 ) E (A _1 )m,i\/ E 



m,i=l 

The estimation of J3 is a bit more involved. We have 
9 



12 J 3 



£ E[(A _1 ) m( i(W/ - W t )(ff\W) - ff\w)) 

m,i,j=l 



(A-'um - Wi)(w; - Wj)f$(w)] 
< (A-%.<|E[(W?-Wi)(^ 

m,i,j=l 

1 



< 



(A^)^ E[A 3 A 1 - - r(W - W))dr] 



We abbreviate M := W 7 ' — r{W' — W). It is important to notice that we can use ( 13. 7p to obtain 
for w eR q 



j,k=i 

Additionally we notice that 



k=l k=l 



j,k=l 



J(l - r)dr = i and J (I - r)MidT 



<|W/| + |Wi|. 



Thus, 



|2J S | < £ (A- 1 )^ 3 E[/(l - r)[-£ /f(M) - £ Mjf^l(M) + j^g^\M)}dr] 

m,i=l n k=l j,k=l k=l 



< 



m,i=l 



(A- 1 )mAA 3 C + A s ^C71og(r l )E[|W5| + l^-l] 



+ 



(l-r) S f(M)rfr] 



fe=i 



< 



m,«=l 

By partial integration we obtain 



^(l + log^^ElK^I) +T 



£ 9 i : 

fc=i 



< ' > / tr ) = J] lg(y/iz+ VT^tW ) (Z)d*. 



Keeping in mind that ^ J § k (z)dz = and using the definitions (12. ip . (12. 2 p and (I2.3P we 

fc=lR8 

obtain 



T < 



k=l 




J2 E il I ^ - r)g(Vtz + VT^tM)^\z)dzdT] 



R9 
y/T=t 



A 



<-i 



< VT^t 



k ~ l R9 

q 



^ 3 £ E [/ \9 + ^A + viV4 (^W) - 9v ^ tA+v - M (VT^tW)] | d>« (*) | dz] 

k — 1 TED n 



^V* 1,-1 ^ s v ' 

K« =:g(Vy,A,t,2) 

A 3 ^ E [ y [g(W, A, t, z) - g(Z, A, t, z)] |$« (z) \ dz] 



< VT^t 



+ 



k=l 

q 



jjf A3 i2 E [J 'g(z,A,t,z)\*£\z)\dz\ 

fc=l Rq 

=: B 1 + B 2 . 

With 5 := sup{|E[c/(W0] - %(-£)] | : £ G £} we obtain 



Si < - — ^A 3 • <5 < -^A 3 . 
Furthermore by using the conditions established for the function classes Q 



B 2 <^A 3 .a 
2^1 



VT^A + Y, J Vi\z\\®k\z)\dA 

\ k=1 Ri / 



V ' " l -A' + aC^^A 3 



2^1 



< aCA 3 f 4= + 1 



Vt 

Thus, combining the estimates of Ji, J 2 and J3 with ( I3.10P we have 

q 

5 < c[\o g (r l )Ai + h g (t- 1 )A 2 + a 3 a 3 (i + logOr 1 ) ^e|^-|) 



i=i 



5 A 
+ -^A 3 A 3 + aA 3 A 3 {- T + 1)] + aVi. 

Setting $\sqrt{t} = 2 C A^3 A_3$, provided it is less than 1, simple manipulations yield the result for
$\Sigma = \mathrm{Id}$. If $t > 1$ for the choice above, then by enlarging $C$ as necessary, the theorem is trivial.
For general $\Sigma$ we can standardize $Y = \Sigma^{-1/2} W$. With the conditions on $\mathcal{G}$ we have that
$g(\Sigma^{1/2} x) \in \mathcal{G}$. Hence the bounds (3.8) and (3.9) can be applied. The proof now continues as
for the $\Sigma = \mathrm{Id}$ case, but with the standardized variables. □
In [10], as well as in [6], Stein's method of exchangeable pairs was introduced for non-normal
distributional approximations. They consider a class of densities on $\mathbb{R}$ which was originally
introduced in [23]. Let $p$ be a regular, strictly positive density on the interval $I = [a,b]$. We
suppose that this density has a derivative $p'$ that is also regular on $I$ with countably many sign
changes. Furthermore $p'$ should be continuous at the sign changes and $\int_I p(x) |\log(p(x))| \, dx < \infty$.
Additionally we assume that

$$\psi(x) := \frac{p'(x)}{p(x)} \tag{3.11}$$

is regular. A density $p$ fulfilling these conditions will be called nice. Now a random variable $Z$
is distributed according to $p$ if and only if

$$E[f'(Z) + \psi(Z) f(Z)] = 0$$

for a suitable class of functions. The corresponding Stein equation is

$$f'(x) + \psi(x) f(x) = g(x) - P(g), \tag{3.12}$$

where $g$ is a measurable function for which $\int_I |g(x)| p(x) \, dx < \infty$, $P(x) := \int_{-\infty}^{x} p(y) \, dy$ and
$P(g) := \int_I g(y) p(y) \, dy$. For the proof of Theorem 2.4 we will apply Theorem 2.4 and Theorem
2.5 in [10]:



Theorem 3.3. Let $p$ be a nice density. Let $p_W$ be a probability distribution such that a random
variable $Z_W$ is distributed according to $p_W$ if and only if $E\bigl(E[W\psi(W)]\, f'(Z_W) + \psi(Z_W) f(Z_W)\bigr) = 0$
for a suitably chosen class of functions and with $\psi$ as in (3.11). Let $(W, W')$ be an exchangeable
pair of real-valued random variables such that

\[ E[W' - W \mid W] = \lambda \psi(W) - R(W) \tag{3.13} \]

for some random variable $R = R(W)$, $0 < \lambda < 1$ and $\psi$ as in (3.11).

(1): Let us assume that for any absolutely continuous function $g$ the solution $f_g$ of (3.12)
satisfies

\[ \|f_g\| \le c_1 \|g'\|, \quad \|f_g'\| \le c_2 \|g'\| \quad\text{and}\quad \|f_g''\| \le c_3 \|g'\|. \]

Then for any uniformly Lipschitz function $g$, we obtain $|E[g(W)] - E[g(Z_W)]| \le \delta \|g'\|$ with

\[ \delta := \frac{c_2}{2\lambda}\bigl(V(E[(W - W')^2 \mid W])\bigr)^{1/2} + \frac{c_3}{4\lambda} E|W - W'|^3 + \frac{c_1 + c_2\sqrt{E(W^2)}}{\lambda}\sqrt{E(R^2)}. \tag{3.14} \]

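(2): Let us assume that for any function $g(x) := 1_{\{x \le z\}}(x)$, $z \in \mathbb{R}$, the solution $f_z$ of (3.12)
satisfies

\[ |f_z(x)| \le d_1, \quad |f_z'(x)| \le d_2 \quad\text{and}\quad |f_z'(x) - f_z'(y)| \le d_3 \]

and

\[ |(\psi(x) f_z(x))'| = \Bigl|\Bigl(\frac{p'(x)}{p(x)} f_z(x)\Bigr)'\Bigr| \le d_4 \tag{3.15} \]

for all real $x$ and $y$, where $d_1$, $d_2$, $d_3$ and $d_4$ are constants. Then we obtain for any $A > 0$

\[ \sup_{t \in \mathbb{R}} \Bigl| P(W \le t) - \int_{-\infty}^{t} p_W(s)\,ds \Bigr| \le \frac{d_2}{2\lambda}\bigl(V(E[(W' - W)^2 \mid W])\bigr)^{1/2} + \Bigl(d_1 + d_2\sqrt{E(W^2)} + \frac{3}{2}A\Bigr)\frac{\sqrt{E(R^2)}}{\lambda} \]
\[ \qquad + \frac{1}{\lambda}\frac{d_4 A^3}{4} + \frac{3A}{2} E(|\psi(W)|) + \frac{d_3}{2\lambda} E\bigl((W - W')^2\, 1_{\{|W - W'| \ge A\}}\bigr). \tag{3.16} \]

4. Auxiliary results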

4. Auxiliary results 

Let us fix the convention that for $k, t \in \{1, \ldots, q\}$ we will always write $\sum_{k \ne m}$ instead of
$\sum_{k=1,\, k \ne m}^{q}$, $\sum_{k,t \ne m}$ instead of $\sum_{k,t=1,\, k \ne m,\, t \ne m}^{q}$, and so on. For the proof of the
multivariate normal approximations we will apply Theorem 3.1 and Theorem 3.2 respectively.
In the introduction we already presented the construction of the exchangeable pair for $W$, the
rescaled empirical spin vector of the Curie-Weiss-Potts model. We have already seen in (1.10)
that this construction will lead us to $G_{\beta,h}$, defined in (1.5). We collect some further results on
this function. First we state a result on the structure of the minimizers of $G_{\beta,h}$, determined in
several papers, collected in
HS1: 



Proposition 4.1. Let $\beta, h \ge 0$ and let $x$ be a global minimum of $G_{\beta,h}$. Then:

(1) The vector $x$ has the coordinate $\min(x_i)$ repeated $q - 1$ times at least.

(2) If $h > 0$, then $x_1 > x_i$ for all $i \in \{2, \ldots, q\}$.

(3) The inequality $\min(x_i) > 0$ holds.

(4) For any $q \ge 3$ and any $(\beta, h)$, or $q = 2$ and $(\beta, h) \ne (\beta_c, 0)$, where $\beta_c$ denotes the critical
inverse temperature, one has $\min(x_i) < 1/\beta$.

An important identity is the following simple statement: 

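Lemma 4.2. For $u \in \mathbb{R}^q$, we obtain

\[ \frac{\exp(\beta u_m + h\delta_{m,1})}{\sum_{k=1}^{q} \exp(\beta u_k + h\delta_{k,1})} = u_m - \frac{1}{\beta}\frac{\partial}{\partial u_m} G_{\beta,h}(u). \]

Proof. Direct calculation yields:

\[ \frac{\partial}{\partial u_m} G_{\beta,h}(u) = \frac{\partial}{\partial u_m}\Bigl[\frac{\beta}{2}\langle u, u\rangle - \log \sum_{k=1}^{q} \exp(\beta u_k + h\delta_{k,1})\Bigr] = \beta u_m - \beta\, \frac{\exp(\beta u_m + h\delta_{m,1})}{\sum_{k=1}^{q} \exp(\beta u_k + h\delta_{k,1})}. \]

Rearranging the equality gives the result. □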
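As a quick sanity check of the identity in Lemma 4.2, the following sketch compares the softmax-type weight on the left-hand side with $u_m - \beta^{-1}\partial G_{\beta,h}/\partial u_m$, where the derivative is approximated by a central finite difference. It assumes the form $G_{\beta,h}(u) = \frac{\beta}{2}\langle u,u\rangle - \log\sum_k \exp(\beta u_k + h\delta_{k,1})$; the function names, the test point and the parameter values are illustrative choices, not taken from the paper.

```python
import math

def G(u, beta, h):
    """Assumed form: (beta/2)<u,u> - log sum_k exp(beta*u_k + h*[k is the first coordinate])."""
    quad = 0.5 * beta * sum(x * x for x in u)
    log_sum = math.log(sum(math.exp(beta * x + (h if k == 0 else 0.0))
                           for k, x in enumerate(u)))
    return quad - log_sum

def softmax_weight(u, beta, h, m):
    """Left-hand side of the identity for coordinate m (0-based index)."""
    weights = [math.exp(beta * x + (h if k == 0 else 0.0)) for k, x in enumerate(u)]
    return weights[m] / sum(weights)

def gradient_side(u, beta, h, m, eps=1e-6):
    """Right-hand side u_m - (1/beta) dG/du_m, using a central finite difference."""
    up = list(u)
    up[m] += eps
    down = list(u)
    down[m] -= eps
    dG = (G(up, beta, h) - G(down, beta, h)) / (2 * eps)
    return u[m] - dG / beta

# Hypothetical test point and parameters.
u = [0.5, 0.2, 0.3]
beta, h = 1.7, 0.4
max_gap = max(abs(softmax_weight(u, beta, h, m) - gradient_side(u, beta, h, m))
              for m in range(len(u)))
print(max_gap)  # difference between both sides, up to discretization error
```

The two sides agree up to the finite-difference error, which is what the direct calculation in the proof predicts.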
Using the notation $m(\sigma) = (m_1(\sigma), \ldots, m_q(\sigma))$ with

\[ m_i(\sigma) := \frac{1}{n}\sum_{t=1}^{n} \delta_{\sigma_t, i} \quad\text{and}\quad m_{i,j}(\sigma) := \frac{1}{n}\sum_{t \ne j} \delta_{\sigma_t, i} \tag{4.1} \]

we obtain:

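Lemma 4.3. For arbitrary $i \in \{1, \ldots, q\}$ we have

\[ P_{\beta,h,n}\bigl(\sigma_j = i \mid (\sigma_t)_{t \ne j}\bigr) = \begin{cases} \dfrac{\exp(\beta m_{i,j}(\sigma))}{\sum_{k=1}^{q} \exp(\beta m_{k,j}(\sigma) + h\delta_{k,1})}, & i \in \{2, \ldots, q\}; \\[2ex] \dfrac{\exp(\beta m_{1,j}(\sigma) + h)}{\sum_{k=1}^{q} \exp(\beta m_{k,j}(\sigma) + h\delta_{k,1})}, & i = 1. \end{cases} \]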



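Proof. For $x_1, \ldots, x_n \in \{1, \ldots, q\}$ we have:

\[ P_{\beta,h,n}\bigl(\sigma_j = i \mid (\sigma_t)_{t \ne j}\bigr) = \frac{P_{\beta,h,n}\bigl(\sigma_j = i, (\sigma_t)_{t \ne j}\bigr)}{P_{\beta,h,n}\bigl((\sigma_t)_{t \ne j}\bigr)}. \]

For any fixed $x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n$ we obtain

\[ \frac{P_{\beta,h,n}(\sigma_1 = x_1, \ldots, \sigma_j = i, \ldots, \sigma_n = x_n)}{P_{\beta,h,n}(\sigma_1 = x_1, \ldots, \sigma_{j-1} = x_{j-1}, \sigma_{j+1} = x_{j+1}, \ldots, \sigma_n = x_n)} = \]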

(n n n 

L E K,x t + h + ~ E <W + ^ E <W + Hi 

q In n n 

E ex P [I E <W ( + & + S E <W + /»£*«„! + M*,! 
fc=i \ i,t# i+j i+j 

Cancelling equivalent terms in numerator and denominator and finally distinguishing between
$i = 1$ and $i \ne 1$ yields the result. □

In the case $h = 0$, in [13, Proposition 2.2] it is proved that the Hessian $D^2G_{\beta,0}(x_0)$ of $G_{\beta,0}$ is
positive definite if $x_0$ is a global minimum point, and hence invertible. In [26, Lemma 2] it is
stated that $D^2G_{\beta,h}(x_0)$ is positive definite for any $\beta > 0$ and $h > 0$, if $x_0$ is a global minimum
point. However this result is not correct. The non-degeneracy of $G_{\beta,h}$ at its minimum points
for any $(\beta, h) \ne (\beta_0, h_0)$ is stated next and will be proved in the Appendix.

Lemma 4.4. For all $q > 2$ let $x_s \in \mathbb{R}^q$ denote a global minimum point of $G_{\beta,h}$. Then, if
$(\beta, h) \ne (\beta_0, h_0)$, $\beta$ never takes one of the values $\bigl\{\frac{1}{x_{s,q}};\ \frac{1}{q\,x_{s,1}x_{s,q}}\bigr\}$ (implying that $D^2G_{\beta,h}(x_s)$ is
positive definite for any $(\beta, h) \ne (\beta_0, h_0)$).

For the rescaled empirical spin vector of the Curie-Weiss-Potts model, appearing in Theorems
2.1, 2.2 and 2.3, we can bound higher order moments as follows:

Lemma 4.5. For $W = (W_1, \ldots, W_q)$ as in Theorems 2.1, 2.2 and 2.3 (the latter with $W^{(i)}$) we obtain
that for any $l \in \mathbb{N}$ and $j \in \{1, \ldots, q\}$

\[ E|W_j^l| \le \mathrm{const.}(l), \qquad E|(W_j^{(i)})^l| \le \mathrm{const.}(l). \]

Proof. We consider a well known transformation, sometimes called the Hubbard-Stratonovich
transformation, expressing the distribution of $L_n$ in the Curie-Weiss-Potts model in terms of
$G_{\beta,h}$. For $\beta > 0$ we pick a random vector $Y$ in a way that $\mathcal{L}(Y)$ equals a $q$-dimensional centered
Gaussian vector with covariance matrix $\beta^{-1}\mathrm{Id}$ and $Y$ is chosen to be independent of $N$. Here $\mathrm{Id}$
denotes the $q \times q$ identity matrix. According to a simple adaptation of Lemma 3.2 in [13], for
any point $m \in \mathbb{R}^q$ and $\gamma \in \mathbb{R}$ and any $n \in \mathbb{N}$ we have

\[ \mathcal{L}\Bigl(\frac{Y}{n^{1/2-\gamma}} + \frac{n(N/n - m)}{n^{1-\gamma}}\Bigr) = \exp\Bigl[-nG_{\beta,h}\Bigl(m + \frac{y}{n^{\gamma}}\Bigr)\Bigr] dy \,\Bigl(\int_{\mathbb{R}^q} \exp\Bigl[-nG_{\beta,h}\Bigl(m + \frac{y}{n^{\gamma}}\Bigr)\Bigr] dy\Bigr)^{-1}. \]

Lemma 3.2 in [13] presented this identity only for $h = 0$. The calculations for any $h \ne 0$ are
omitted. We apply this lemma with $\gamma = \frac{1}{2}$ and $m = x_0$ (or any other minimum point of $G_{\beta,h}$);
this does not change the finiteness of any of the moments of the $W_j$. Thus, the new measure has
the density

\[ \exp\Bigl[-nG_{\beta,h}\Bigl(x_0 + \frac{y}{n^{1/2}}\Bigr)\Bigr] dy \,\Bigl(\int_{\mathbb{R}^q} \exp\Bigl[-nG_{\beta,h}\Bigl(x_0 + \frac{y}{n^{1/2}}\Bigr)\Bigr] dy\Bigr)^{-1}. \]

Using second order multivariate Taylor expansion of $G_{\beta,h}$ and the fact that $x_0$ is a global
minimum point of $G_{\beta,h}$ we see that the density of the new measure with respect to Lebesgue
measure is given by $\mathrm{const.}\exp\bigl[-\frac{1}{2}\langle y, D^2G_{\beta,h}(x_0)\, y\rangle\bigr]$ (up to negligible terms). With Lemma 4.4
we know that for any $(\beta, h) \ne (\beta_0, h_0)$ the Hessian is positive definite, if $x_0$ is a global minimum
point. This fact combined with the transformation of integrals yields that a measure with this
density has moments of any finite order. □



For the random variables $T$ and $V$ in Theorems 2.4 and 2.6 we can bound higher order
moments as well:

Lemma 4.6. Consider the extremity $(\beta, h) = (\beta_0, h_0)$. For $T$ and $V$ as in (2.5) we obtain that
for any $l \in \mathbb{N}$ and $j \in \{1, \ldots, q\}$

\[ E|V_j^l| \le \mathrm{const.}(l), \qquad E|T^l| \le \mathrm{const.}(l). \]

Proof. Remark that with $V \in \mathcal{M} \cap u^{\perp}$ we obtain $V_1 = 0$. Hence $W_1 = (1 - q)n^{1/4}T$. Therefore

\[ V = \frac{1}{n^{1/2}}\bigl(N - nx - n^{3/4}Tu\bigr) = W - n^{1/4}Tu = \Bigl(0,\ W_2 + \frac{1}{q-1}W_1,\ \ldots,\ W_q + \frac{1}{q-1}W_1\Bigr). \]
Since $W := (W_1, \ldots, W_q) \in \mathcal{M}$, we have $W_1 = -\sum_{k=2}^{q} W_k$. We have to check that
$\tilde V := (V_2, \ldots, V_q)$ has finite moments. Thus it suffices to check that $\tilde W := (W_2, \ldots, W_q)$ has
finite moments. Now we define $G_{q-1}(x)$, $x \in \mathbb{R}^{q-1}$, to be the restriction of $G_{\beta_0,h_0}$ to the last
$q - 1$ coordinates (the first coordinate will be fixed to $1/2$ in the sequel). Again we apply the
Hubbard-Stratonovich transformation introduced in the proof of Lemma 4.5. We choose a
$(q-1)$-dimensional Gaussian vector $Y$ with covariance matrix $\beta_0^{-1}\mathrm{Id}_{q-1}$ and independent of $W$. With
$\tilde x = (1/(2(q-1)), \ldots, 1/(2(q-1))) \in \mathbb{R}^{q-1}$ we have that the law of $Y + \tilde W$ has the density

\[ \exp\Bigl[-nG_{q-1}\Bigl(\tilde x + \frac{y}{\sqrt{n}}\Bigr)\Bigr] dy \,\Bigl(\int \exp\Bigl[-nG_{q-1}\Bigl(\tilde x + \frac{y}{\sqrt{n}}\Bigr)\Bigr] dy\Bigr)^{-1}. \]

Using second order multivariate Taylor expansion of $G_{q-1}$ and the fact that $(\nabla G_{q-1})(\tilde x) = 0$
($(1/2, \tilde x) \in \mathbb{R}^q$ is a global minimum point of $G_{\beta_0,h_0}$), we see that the density of the new measure
with respect to Lebesgue measure is given by $\mathrm{const.}\exp\bigl[-\frac{1}{2}\langle y, D^2G_{q-1}(\tilde x)\, y\rangle\bigr]$ (up to negligible
terms). Using the formulas for the second partial derivatives of $G_{\beta_0,h_0}$, see Remark 6.3 in the
Appendix, we obtain that

\[ D^2G_{q-1}(\tilde x) = \frac{4}{q^2}\bigl(1_{q-1} + \mathrm{Id}_{q-1}\,(q-1)(q-2)\bigr), \]

where $1_{q-1}$ denotes the $(q-1) \times (q-1)$ matrix with all entries equal to 1. It is an immediate
computation that

\[ \det\bigl(D^2G_{q-1}(\tilde x)\bigr) = \Bigl(\frac{4(q-1)(q-2)}{q^2}\Bigr)^{q-2}\Bigl(\frac{4(q-1)(q-2)}{q^2} + \frac{4(q-1)}{q^2}\Bigr), \]

which shows the invertibility of $D^2G_{q-1}(\tilde x)$ for any $q \ge 3$. Thus $D^2G_{q-1}(\tilde x)$ is positive definite.
This fact combined with the transformation of integrals yields that a measure with this density
has moments of any finite order.
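The determinant used above can be checked exactly for small $q$. The sketch below builds the matrix $\frac{4}{q^2}\bigl(1_{q-1} + (q-1)(q-2)\,\mathrm{Id}_{q-1}\bigr)$ with rational arithmetic and compares its determinant with the closed form $e^{q-2}\bigl(e + \frac{4(q-1)}{q^2}\bigr)$, $e = \frac{4(q-1)(q-2)}{q^2}$. The helper names are illustrative, and the elimination routine is a generic one written for this check, not something taken from the paper.

```python
from fractions import Fraction

def det(matrix):
    """Determinant via Gaussian elimination, exact when entries are Fractions."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a non-zero pivot in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def hessian_block(q):
    """(4/q^2) * (all-ones matrix + (q-1)(q-2) * identity), of size (q-1) x (q-1)."""
    c = Fraction(4, q * q)
    e = (q - 1) * (q - 2)
    return [[c * (1 + e) if i == j else c for j in range(q - 1)]
            for i in range(q - 1)]

def closed_form(q):
    """Closed-form determinant stated in the text."""
    e = Fraction(4 * (q - 1) * (q - 2), q * q)
    return e ** (q - 2) * (e + Fraction(4 * (q - 1), q * q))

checks = {q: det(hessian_block(q)) == closed_form(q) for q in range(3, 8)}
print(checks)
```

For $q \ge 3$ both factors of the closed form are strictly positive, which is the invertibility statement used in the proof.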

For $(1 - q)T = n^{-1/4}W_1$ we apply the Hubbard-Stratonovich transform with $\gamma = 1/4$. Take
a centered Gaussian random variable $Y$ with suitable variance, independent of $W_1$. The
distribution of $n^{-1/4}Y + T$ has a density proportional to $\exp\bigl(-nG_1(c_q \tfrac{1}{2} + y/n^{1/4})\bigr)$ with some
constant $c_q$ only depending on $q$ and $G_1$ being the restriction of $G_{\beta_0,h_0}$ to the first component.
A fourth order Taylor expansion similar to (6.4) will give $G_1(x + t) = G_1(x) + \frac{1}{24}G_1^{(4)}(x + \alpha t)\,t^4$
for some $\alpha \in (0, 1)$. Hence we conclude with a Lebesgue density given by $\mathrm{const.}\exp(-y^4)$, a
measure which has moments of any finite order. We omit the details. □



5. Proofs of the theorems 



Proof of Theorem 2.1. Our goal is to apply Theorem 3.1. First, given $W$, we construct a
coupling $W'$ and will calculate $\Lambda$ and $R$ to get the approximate regression identity (1.7).
We will first of all deal with the case $h = 0$. Hence by Theorem 1.1 we have $\beta < \beta_c$ and
$x_0 = (1/q, \ldots, 1/q)$ being the unique minimum point of $G_{\beta,0}$. By the construction given in the
introduction, applying Lemma 4.3 and Lemma 4.2 we obtain for any $i = 1, \ldots, q$:

\[ E[W_i' - W_i \mid \mathcal{F}] = -\frac{1}{n}W_i - \frac{1}{\sqrt{n}}x_{0,i} + \frac{1}{\sqrt{n}}\frac{1}{n}\sum_{j=1}^{n} E[Y'_{j,i} \mid \mathcal{F}] = -\frac{1}{\sqrt{n}}\frac{1}{\beta}\frac{\partial}{\partial u_i}G_{\beta,0}(m(\sigma)) + R_n^{(1)}(i) \tag{5.1} \]

with

\[ R_n^{(1)}(i) := \frac{1}{\sqrt{n}}\frac{1}{n}\sum_{j=1}^{n}\Bigl[\frac{\exp(\beta m_{i,j}(\sigma))}{\sum_{k=1}^{q}\exp(\beta m_{k,j}(\sigma))} - \frac{\exp(\beta m_i(\sigma))}{\sum_{k=1}^{q}\exp(\beta m_k(\sigma))}\Bigr], \tag{5.2} \]

where $m_i(\sigma)$ and $m_{i,j}(\sigma)$ are defined as in (4.1). We have used

\[ m_i(\sigma) - x_{0,i} = \frac{W_i}{\sqrt{n}}. \tag{5.3} \]

Now we apply (6.2) (see Appendix) to the first summand in (5.1). Since $x_0$ is a global minimum
of $G_{\beta,0}$ we have $\bigl(\frac{\partial}{\partial u_i}G_{\beta,0}\bigr)(x_0) = 0$. Hence the first summand in (5.1) is equal to

\[ -\frac{1}{\beta n}\sum_{k=1}^{q}\Bigl(\frac{\partial^2}{\partial u_i \partial u_k}G_{\beta,0}\Bigr)(x_0)\,W_k + R_n^{(2)}(i) \]

with

\[ R_n^{(2)}(i) = O\Bigl(\frac{1}{n^{3/2}}\Bigl(\sum_{k=1}^{q} W_iW_k + \sum_{k,t \ne i} W_kW_t\Bigr)\Bigr). \]

Summarizing, with $R(i) := R_n^{(1)}(i) + R_n^{(2)}(i)$ we have

\[ E[W_i' - W_i \mid \mathcal{F}] = -\frac{1}{\beta n}\sum_{k=1}^{q}\Bigl(\frac{\partial^2}{\partial u_i \partial u_k}G_{\beta,0}\Bigr)(x_0)\,W_k + R(i) = -\frac{1}{\beta n}\bigl\langle [D^2G_{\beta,0}(x_0)]_i, W\bigr\rangle + R(i), \tag{5.4} \]

where $\langle\cdot,\cdot\rangle$ denotes the Euclidean scalar product and $[D^2G_{\beta,0}(x_0)]_i$ the $i$-th row of the matrix
$D^2G_{\beta,0}(x_0)$. We obtain

\[ E[W' - W \mid \mathcal{F}] = -\frac{1}{\beta n}[D^2G_{\beta,0}(x_0)]\,W + R(W) \tag{5.5} \]

with $R(W) = (R(1), \ldots, R(q))$. We define $\Lambda = \frac{1}{\beta n}[D^2G_{\beta,0}(x_0)]$. By [13, Proposition 2.2],
$D^2G_{\beta,0}(v)$ is positive definite for any $\beta > 0$ and any global minimum point $v$ and therefore
$\Lambda$ is invertible (alternatively one easily sees that $\Lambda$ is a matrix of the form (6.3)
and the determinant is $\frac{1}{n^q}(1 - \beta/q)^{q-2}(1 - \beta/q)$, which is non-zero because $\beta_c < q$ with $\beta_c$
given in (1.2), and therefore $\beta \ne q$, see Lemma 6.2). Hence (1.7) is fulfilled and we are able
to apply Theorem 3.1. In order to calculate the bound given there we need to estimate $\lambda^{(i)}$ as
well as the order of the terms $A$, $B$ and $C$. Note that often in an application of Theorem 3.1 it
might be tedious to calculate $\Lambda$ (and $\Sigma$) and it is not clear whether the calculations have been
carried out correctly. In Remark 5.1 we will point out that there is a nice heuristic in the
Curie-Weiss-Potts model which predicts the form of $\Lambda$ obtained here.

Obviously we have $\lambda^{(i)} = O(n)$. We continue by estimating $C$ in Theorem 3.1. First we
consider $R_n^{(1)}(i)$ defined in (5.2):

\[ |R_n^{(1)}(i)| = \frac{1}{\sqrt{n}}\frac{1}{n}\Bigl|\sum_{j=1}^{n}\Bigl[\frac{\exp(\beta m_{i,j}(\sigma))}{\sum_{k=1}^{q}\exp(\beta m_{k,j}(\sigma))} - \frac{\exp(\beta m_i(\sigma))}{\sum_{k=1}^{q}\exp(\beta m_k(\sigma))}\Bigr]\Bigr| \]
\[ \le \frac{1}{\sqrt{n}}\frac{1}{n}\sum_{j=1}^{n}\sum_{k \ne i}\bigl|\exp(\beta m_{i,j}(\sigma) + \beta m_k(\sigma)) - \exp(\beta m_i(\sigma) + \beta m_{k,j}(\sigma))\bigr|. \]

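Using the inequality

\[ |\exp(ax) - \exp(ay)| \le \frac{|a|}{2}\bigl(\exp(ax) + \exp(ay)\bigr)|x - y|, \quad\text{for all } a, x, y \in \mathbb{R}, \]

we obtain

\[ |R_n^{(1)}(i)| \le \frac{1}{\sqrt{n}}\frac{1}{n}\,\beta e^{2\beta}\sum_{j=1}^{n}\sum_{k \ne i}\bigl|m_{i,j}(\sigma) + m_k(\sigma) - m_i(\sigma) - m_{k,j}(\sigma)\bigr|. \]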
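The elementary exponential inequality used here can be probed numerically. The sketch below assumes the constant is $|a|/2$, that is $|e^{ax} - e^{ay}| \le \frac{|a|}{2}(e^{ax} + e^{ay})|x - y|$, which follows from $\tanh(s) \le s$ for $s \ge 0$; the sampling ranges and the seed are arbitrary illustrative choices.

```python
import math
import random

def slack(a, x, y):
    """Right-hand side minus left-hand side of the assumed inequality."""
    lhs = abs(math.exp(a * x) - math.exp(a * y))
    rhs = 0.5 * abs(a) * (math.exp(a * x) + math.exp(a * y)) * abs(x - y)
    return rhs - lhs

rng = random.Random(0)  # fixed seed so the check is reproducible
worst = min(
    slack(rng.uniform(-3, 3), rng.uniform(-2, 2), rng.uniform(-2, 2))
    for _ in range(10000)
)
print(worst)  # smallest observed slack; non-negative up to rounding error
```

A random search of this kind is only a plausibility check, not a proof; the analytic argument is the $\tanh$ bound mentioned above.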

Consider the first summand $j = 1$. In case $\sigma_1 = i$, we have for all $k \ne i$ that $m_{k,1}(\sigma) = m_k(\sigma)$,
and therefore

\[ \sum_{k \ne i}\bigl|m_{i,1}(\sigma) + m_k(\sigma) - m_i(\sigma) - m_{k,1}(\sigma)\bigr| = \frac{q-1}{n}. \]

If $\sigma_1 \ne i$, then there is a $t \ne i$ with $m_{t,1}(\sigma) \ne m_t(\sigma)$ and for all $k \ne t$: $m_{k,1}(\sigma) = m_k(\sigma)$. By a
similar observation we have

\[ \sum_{k \ne i}\bigl|m_{i,1}(\sigma) + m_k(\sigma) - m_i(\sigma) - m_{k,1}(\sigma)\bigr| \le \frac{q-1}{n}. \]

The same observation can be made for any other $j \in \{1, \ldots, n\}$. Hence we get

\[ |R_n^{(1)}(i)| \le \frac{\beta e^{2\beta}(q-1)}{n^{3/2}}. \]

Since $W \in \mathcal{M}$, see (2.1), we get $\sum_{k \ne i} W_k = -W_i$. By Lemma 4.5 we know that $E|W_i^2| \le \mathrm{const.}(2)$
and therefore we obtain that $E|R_n^{(2)}(i)|$ in (5.4) is $O(n^{-3/2})$. Thus the Cauchy-Schwarz
inequality yields $E[R(i)^2] = O(n^{-3})$ for all $i \in \{1, \ldots, q\}$. We have

\[ C = \sum_{i=1}^{q} \lambda^{(i)}\sqrt{E[R(i)^2]} = O(n^{-1/2}). \]
The next thing we notice is that $|W_i' - W_i| = \frac{1}{\sqrt{n}}|Y'_{I,i} - Y_{I,i}| \le \frac{1}{\sqrt{n}}$ for all $i$. Thus we easily obtain
the bound $B = O(n^{-1/2})$. It remains to calculate and to estimate the conditional variance in
$A$. This is a bit more involved. We have:



1 n 1 n 
t,k=l t,k=l 

2 n 

3 E Y ^ E Kj I =■ A, + A 2 + A 3 . 



n° 

t,k=l 



Hence we have to bound the variances of these terms. By definition VL4i] = -^V[mj(er) rrij(a)]. 
Now 

V[mi(a) mj{<j)\ = V( + -7=x j + -7^0,.) 



1 1 con st" 

< const. max(— V(Wi W}), -V(Wi)) < — ^ (E[WfWf ] + nE[W 2 }). 

We make use of Lemma 4.5 to obtain $V[m_i(\sigma)\,m_j(\sigma)] = O(1/n)$ and hence $V[A_1] = O(n^{-3})$.
Using a conditional version of Jensen's inequality we have

-. n 1 " 1 n 

¥[A 2 ] < E(V[- £ Y^r;.} I ^ = E(V[- £ Yk.iYtj] I ^ = V (~3 E 
t,fe=i t,fc=i t,fe=i 

Hence $V[A_2] = O(n^{-3})$. With Lemma 4.3 we get

Y n 1 n 

-V2 = raE y wE[^|Jl = ZsE y W 



exp (dm j)t (a)) 

*,fc=i t,&=i exp (f3mi ; t(cr)) 

i=i 

if y / exp (Pm j>t (cr)) expQr^q)) \ J_v^ v 
*fc=i v VexD (BmiJa)) exD C Bmi(a))' fc=i 



exp (Brrij(a)) 

■X)exp(/3mi,t(c7-)) J] ex P (/ 3m K cr )) / '" fc=1 I] exp (/3m ? (a) ) 
;=i z=i z=i 



= : .1/, • M 2 . 
By using the same estimations as for Rn\i) we obtain 

f i < ^3 E F M (« - ^ = 1 (9 - + *o,i). 

n 13 z — ' n \/n 

t,k=i 



Mi 



Hence V(Mi) = 0(n 3 ) by Lemma 1431 We obtain 

1 Id 

M 2 = -mi(a)(mj(a) - -—Gp t0 (m(a))) 

1 1 / d 2 

= -Tni(&)mj(<r) - — m^a) f (-^— G^ ) (ar ) (m i (o-) - ar j) 

+ Efa ^, )(xo)(m fe (<T) -x 0>k ) + ^R^(j)), 
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where the first equality follows from Lemma \A.2\ the second from (16. 2 j) and the definition of 
Rn\j) in J53D. Hence 



Mo 



C)( ^-m i (a)m i (or) j + £>Qm;(a xli 



0(-- 1/2 4 2) (j))- 



The first two summands are of order $O(W_j/n^{3/2})$ and the last term is of order $O(n^{-2})$. Applying
Lemma 4.5 it follows that the maximal variance of all the sums in the representation of $M_2$ is
of order $O(n^{-3})$ and therefore $V(A_3) = O(n^{-3})$. Thus the variance in $A$ of Theorem 3.1 can
be bounded by 9 times the maximum of the variances of $A_1$, $A_2$, $A_3$, which is a constant times
$n^{-3}$. Thus we obtain

\[ A = \sum_{i,j=1}^{q} \lambda^{(i)}\sqrt{V\bigl[E[(W_i' - W_i)(W_j' - W_j) \mid W]\bigr]} = O(n^{-1/2}). \]

This completes the proof for $h = 0$. Note that we have used the fact that the fourth moment
of $W_i$ is bounded. We did not need the finiteness of any higher moment. We have proved a
fourth-moment theorem together with a rate of convergence of order $O(n^{-1/2})$.

If $h \ne 0$ we will slightly change the proof. Here are the details. By Theorem 1.1 we know
that for h > and (/3, h) ^ hx, the function G^h has a unique global minimum point. Let xq 
be the unique global minimum point. Analogously to the first part of our proof we obtain 



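\[ E[W_i' - W_i \mid \mathcal{F}] = -\frac{1}{\beta n}\bigl\langle [D^2G_{\beta,h}(x_0)]_i, W\bigr\rangle + R(i,h) \tag{5.6} \]

with $R(i,h) := R_n^{(1)}(i,h) + R_n^{(2)}(i)$ with the new

\[ R_n^{(1)}(i,h) := \frac{1}{\sqrt{n}\,n}\sum_{j=1}^{n}\Bigl[\frac{\exp(\beta m_{i,j}(\sigma) + h\delta_{i,1})}{\sum_{k=1}^{q}\exp(\beta m_{k,j}(\sigma) + h\delta_{k,1})} - \frac{\exp(\beta m_i(\sigma) + h\delta_{i,1})}{\sum_{k=1}^{q}\exp(\beta m_k(\sigma) + h\delta_{k,1})}\Bigr] \]

and the same $R_n^{(2)}(i)$ given in (5.4). Again $\Lambda = \frac{1}{\beta n}[D^2G_{\beta,h}(x_0)]$. This matrix has a simple
structure. With Lemma 6.1 we obtain

\[ a = \frac{1}{\beta n}\Bigl(\frac{\partial^2}{\partial u_1^2}G_{\beta,h}\Bigr)(x_0) = \frac{1 - \beta(q-1)x_{0,1}x_{0,q}}{n}, \qquad b = \frac{1}{\beta n}\Bigl(\frac{\partial^2}{\partial u_1 \partial u_q}G_{\beta,h}\Bigr)(x_0) = \frac{\beta x_{0,1}x_{0,q}}{n}. \]

Moreover

\[ d = \frac{1}{\beta n}\Bigl(\frac{\partial^2}{\partial u_q^2}G_{\beta,h}\Bigr)(x_0) = \frac{1 - \beta(x_{0,1}x_{0,q} + (q-2)x_{0,q}^2)}{n}, \qquad c = \frac{1}{\beta n}\Bigl(\frac{\partial^2}{\partial u_2 \partial u_q}G_{\beta,h}\Bigr)(x_0) = \frac{\beta x_{0,q}^2}{n}. \]

Hence $\Lambda$ has the form (6.3) and according to Lemma 6.2 we have

\[ \det(\Lambda) = \frac{1}{n^q}(1 - \beta x_{0,q})^{q-2}(1 - q\beta x_{0,1}x_{0,q}). \]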
So if $\beta \notin \bigl\{\frac{1}{x_{0,q}}, \frac{1}{q\,x_{0,1}x_{0,q}}\bigr\}$, the matrix $\Lambda$ is invertible. With Lemma 4.4 we get that $\Lambda$ is invertible
for all $(\beta, h) \ne (\beta_0, h_0)$ and hence we are able to apply Theorem 3.1. The bound of $R_n^{(1)}(i,h)$
is $e^h$ times the bound of $R_n^{(1)}(i)$, implying the same order of $C$. The proof of bounding $B$ is
unchanged. Bounding $A$ needs once more the bound of $R_n^{(1)}(i,h)$ and hence the proof is almost the
same as in the case $h = 0$. For the $q$-dependence we note that $E\bigl[|W_i' - W_i|\,|W_j' - W_j|\,|W_k' - W_k|\bigr]$
is independent of $q$ and $\lambda^{(i)} = O(q)$. Thus $B = O(q^4)$ by summing three times to $q$. Previous
estimations show that $R_n^{(1)} = O(q)$ and because we have two sums from 1 to $q$ for $W_k$ and $W_t$
we have that $R_n^{(2)} = O(q^2)$. Thus $E[R(i)^2] = O(q^4)$. Hence $C = O(q^4)$ by summing over
$q$. Next we consider $A_1$, $A_2$ and $A_3$ taken from the proof. While $A_1$ and $A_2$ are independent
of $q$, $A_3$ depends on $q$ via $R_n^{(1)}$ and $R_n^{(2)}$. Summing two times over $q$, we obtain $A = O(q^5)$.
Since the brackets in the bound of Reinert and Röllin still contain the parameter $q$ and since
$\|\Sigma^{1/2}\| = O(q)$, the constant $C$ of our theorem satisfies $C = O(q^6)$. □

Proof of Theorem 2.2. Since the first part of the proof follows the lines of the proof of Theorem 2.1, we
notice that Theorem 3.2 can be applied. Thus it remains to estimate the bound given there.
For the first expression in the bound we notice that $A_1$ is the same expression as the $A$-term we
just calculated for the proof of Theorem 2.1. Hence, $\log(n)A_1 = O(\log(n)\,n^{-1/2})$. With Lemma 4.5 and
the estimation for the $C$-term in Theorem 2.1 we obtain that the second expression is $O(\log(n)\,n^{-1/2})$.
For the third expression we notice that $a > 1$ is a constant and that $A_3 = O(n)$. Using
again Lemma 4.5 combined with the fact that $A = \frac{1}{\sqrt{n}}$ yields that the third expression is also
$O(\log(n)\,n^{-1/2})$. Likewise we obtain that the fourth expression is $O(n^{-1/2})$. Combining these
estimates yields the result. □

Remark 5.1 (Heuristics). By definition of $G_{\beta,h}$, (1.5), the Hessian of $G_{\beta,h}$ fulfills $D^2G_{\beta,h}(x) =
\beta\,\mathrm{Id} - \beta^2 D^2\Phi(x)$, where $\Phi$ is the log-moment generating function of the single-spin distribution
in the Curie-Weiss-Potts model and $x$ is any minimum point. Hence $D^2\Phi$ is the covariance
structure of the single spins, which is

\[ D^2\Phi(x) = -\frac{1}{\beta^2}\bigl(D^2G_{\beta,h}(x) - \beta\,\mathrm{Id}\bigr). \tag{5.7} \]

We know from Stein's method that if $(W, W')$ is exchangeable and (1.7) is satisfied with $R = 0$
we have

\[ \frac{1}{2}E[(W' - W)(W' - W)^t] = \Sigma\,\Lambda^t. \]

On the one hand in the Curie-Weiss-Potts model we have $\Sigma = [D^2G_{\beta,h}(x)]^{-1} - \beta^{-1}\mathrm{Id}$. On the
other hand the left hand side describes the empirical covariance structure of the single spins:

\[ \frac{1}{2}E[(W_i' - W_i)(W_j' - W_j)] = \frac{1}{2n}E\bigl[(Y'_{I,i} - Y_{I,i})(Y'_{I,j} - Y_{I,j})\bigr]. \]

Therefore with (5.7), heuristically

\[ \frac{1}{2}E[(W' - W)(W' - W)^t] \approx \frac{1}{\beta^2 n}\bigl(-D^2G_{\beta,h}(x) + \beta\,\mathrm{Id}\bigr) = \bigl([D^2G_{\beta,h}(x)]^{-1} - \beta^{-1}\mathrm{Id}\bigr)\Lambda^t. \]

If we now choose $\Lambda = \Lambda^t = \frac{1}{\beta n}D^2G_{\beta,h}(x)$, the right hand identity is fulfilled.

Proof of Theorem 2.3. The proof uses the fact that the conditional joint distribution of the
$(\sigma_j)_j$, conditioned on the event $\{N/n \in B(x_i, \varepsilon)\}$, is given by

I ( B n \ 

Pp,h,nM = v ex P 7T Yl +h z2 5 °a 1 B(x t ,e)(N/n), 

where $Z_{\beta,h,n,i}$ denotes a normalization. Thus we are able to start with any minimum point $x_0$
and follow the lines of the proof of Theorem 2.1. □



Proof of Theorem 2.4. We will apply Theorem 3.3. Obviously the density $p$ is nice. Note that



the logarithmic derivative is tp{t) = = — 16( - q ~^ t 3 . The solutions f g of the corresponding 
Stein equations (3.12), with respect to absolutely continuous test functions $g$ and with respect
to $g(x) = 1_{\{x \le z\}}(x)$, $z \in \mathbb{R}$, respectively, fulfill all boundedness assumptions of Theorem 3.3.
This was proved in [10, Lemma 2.2]. By definition of $T$, see (2.5), we have

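\[ T = \frac{1}{(1-q)n^{3/4}}\bigl(N_1 - nx_1 - \sqrt{n}\,V_1\bigr) = \frac{1}{(1-q)n^{3/4}}\Bigl(\sum_{j=1}^{n} Y_{j,1} - nx_1 - \sqrt{n}\,V_1\Bigr). \]

We make use of the choice $V \in \mathcal{M} \cap u^{\perp}$. With $V \in \mathcal{M}$ we have $\sum_{i=1}^{q} V_i = 0$ and with
$\langle V, u\rangle = V_1(1-q) + \sum_{i=2}^{q} V_i = 0$ we obtain

\[ \sum_{i=2}^{q} V_i = -V_1 = 0. \]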

Constructing an exchangeable pair $(T, T')$ is just the same as in the introduction: $T'$ is a
random variable being the same as $T$ except that we pick an index $I$ uniformly and exchange
$Y_{I,1}$ with $Y'_{I,1}$ (for $I = i$ distributed according to the conditional distribution of $Y_{i,1}$ given
$(Y_{j,1})_{j \ne i}$, independently of $Y_{i,1}$). Now we calculate $E[T' - T \mid \mathcal{F}]$ with $\mathcal{F} = \sigma(\sigma_1, \ldots, \sigma_n)$.

nr-r\F] = il _ 1 )n7/4 irE[Y> 1 -Y, 1 \T} 

^ ' i=l 

1 - 1 

= w\y' l tp] _ _7" _ Xl 

(l-g)n 7 / 4 e 1 : ' J n (l-g)n 3 / 4 " 



i=l 

With Lemma 14.21 we obtain 



1 11 d 1 



(l_ g ) n 7/4^ l Mi- J (i_ g ) ra 3/4V y /?o(9xi -po,«ov v (1 - g)™ 1 ^ 

Hence using mi(<r) = Xi + ^/i 7 " and defining R : = . _i 1 / 4 i?l 1 (i, /io) we have 
A quite tedious fourth-order Taylor expansion of $G_{\beta_0,h_0}$ at $x + \frac{Tu}{n^{1/4}} + \frac{V}{n^{1/2}}$ is carried out in the
Appendix, see (6.11), which leads to



d ( Tu V \ 16(g- 1) 



vi 



with $f(v, t)$ given in (6.7). Hence we obtain $E[T' - T \mid \mathcal{F}] = \lambda\psi(T) - R$ with

g(g — l)/3o?i 3 / 2 

and 

-* : = ° (E ^r) + (19 + °(^7i/( l '/^. r /" 1/4 )) - «■ 

Now $0 < \lambda < 1$ for all $n \in \mathbb{N}$ and thus we can apply Theorem 3.3. The moments of $T$ and $V$
are finite, see Lemma 4.6. With $V_1 = 0$ we get $T = \frac{1}{(1-q)n^{1/4}}W_1$. Now we are able to compute
the expressions of the bound in Theorem 3.3. We have

\[ E[(T' - T)^2 \mid T] = \frac{1}{(1-q)^2 n^{1/2}}\,E[(W_1' - W_1)^2 \mid W]. \]

Reproducing the proof of Theorem Owe get Y {E[{W[ - W{) 2 \ 7}) = 0(n- 7 ' 2 ), using E|W l | < 
n '/4 E | T i| = 0{n 1 '^), I G N. Thus 

§ (V (E[(T' - T) 2 | T])) 1/2 = Oin- 1 ^). 

Moreover E|T' - T| 3 = ^ w 3/ a E|>j - */l = C>(™~ 9/4 ) and therefore ffE|T' - T| 3 = 
(9(n -3 / 4 ). From the proof of Theorem 12.11 we know that \Rn\i, ho)\ = 0(n~ 3 ^ 2 ), so R = 
0(n~ 7 ^). Remark that by QEZD we see that the expectation of Oi ^j^fiV/y/n^T/n 1 ^) J is of 



order 0(n 2 ). Summarizing we obtain i^/E[i? 2 ] = 0(n 7//4 ), hence 



Hence the $\delta$ in (3.14) is of order $O(n^{-1/4})$. We obtain the same rate of convergence in the
Kolmogorov distance, using $|T' - T| \le \frac{\mathrm{const.}}{n^{3/4}} =: A$. The order of the first two summands in
(3.16) is $O(n^{-1/4})$. The third term in (3.16) is of order $O(n^{-3/4})$ and finally

3 4n^T)\< c -^m 3 \ = o(n-y% 

which completes the proof. □ 

Proof of Theorem 2.6. Since $V_1 = 0$, by the continuous mapping theorem it suffices to show
that the random vector $\tilde V := (V_2, \ldots, V_q)$ converges towards the $(q-1)$-dimensional centered
Gaussian vector with covariance matrix
2 ( g -l)2(q_2)



(q-2 -1 ... 
-1 q-2 -1 



V 



-1 \ 

-1 



■i -i q-2/ 

We will apply Theorem 3.1. We see for any $i \ge 2$

i " i n 

E[V>- Vi | J] =E\Wl-Wi | J] = --^E^ + ^E E ^ I n 

.7 = 1 .7=1 



With Lemma 4.2 and $R_n^{(1)}(i, h_0)$ defined as in (5.2) we obtain



i=i 
1 9 



1 (9 



f3 Jndx, 



00 + ^171 + ^172 J +#i 1) Mo)- 



By the fourth-order Taylor expansion of $\frac{\partial}{\partial x_i}G_{\beta_0,h_0}\bigl(x + \frac{Tu}{n^{1/4}} + \frac{V}{n^{1/2}}\bigr)$, see (6.6), (6.7), (6.9) and
(6.12) in the Appendix, we obtain for any $i \in \{2, \ldots, q\}$



Wi -Vi\v) 



4 



■((!,..., 1, 



3q + 3),l...,l),V)+R i 



(5.9) 



/3 nq 2 

where (q — l)(q — 2) is the i'th entry of the vector in M 9-1 and 

* : = (E ^) + °(^^ ^ T /" 1/4 )) + °(^) + °(^) + ^ 

(see (6.7) for the definition of $A_2$). Using $\sum_{k=2,\,k \ne i}^{q} V_k = -V_i$ we get

\[ \bigl\langle (1, \ldots, 1, (q^2 - 3q + 3), 1, \ldots, 1), \tilde V\bigr\rangle = (q-1)(q-2)\,V_i. \]

Thus the linearity condition of Theorem 3.1 is satisfied with $\Lambda = \frac{q-2}{qn}\,\mathrm{Id}_{(q-1)\times(q-1)}$ and $R =
(R_2, \ldots, R_q)$. With $q - 2 > 0$ for $q \ge 3$ we get the invertibility of $\Lambda$ and $\lambda^{(i)} = O(n)$. From the
proof of Theorem 2.1 we see that

mv; - Vi){v; - Vi) 1 f\ = e[{wi - w^w; - w 3 ) \ f\ = o{ n -^ 2 ) 

and thus $A$ in Theorem 3.1 is of order $O(n^{-1/2})$. Moreover $|V_i' - V_i| \le \frac{\mathrm{const.}}{\sqrt{n}}$ for all $i$ and thus
$B$ in Theorem 3.1 is of order $O(n^{-1/2})$. It remains to calculate $C$ in Theorem 3.1. From the
proof of Theorem 2.1 we know that $E(R_n^{(1)}(i, h_0)) = O(n^{-3/2})$. Using bounded moments of $V_i$
and $T$ we obtain that the expectations of the first and third term of $R_i$ are $O(n^{-5/4})$. With
(6.7) we further get that the expectations of the second and fourth term of $R_i$ are $O(n^{-3/2})$ and
$O(n^{-5/2})$, respectively. By the Cauchy-Schwarz inequality we get $\sqrt{E[R_i^2]} = O(n^{-5/4})$, hence
$C = O(n^{-1/4})$, which completes the proof. □
Remark 5.2. In Remark 5.1 we gave a heuristic that the matrix $\Lambda$ in the regression condition
(1.7) should be expected to be $\frac{1}{\beta n}D^2G_{\beta,h}(x)$. Our heuristic is confirmed in the proof of Theorem
2.6, since we can rewrite (5.9) as

\[ E[\tilde V' - \tilde V \mid \tilde V] = -\frac{1}{\beta_0 n}D^2G_{q-1}(\tilde x)\,\tilde V + R \]

where $D^2G_{q-1}(\tilde x)$ denotes the upper $(q-1) \times (q-1)$ part of $D^2G_{\beta_0,h_0}(x)$. The limiting covariance
matrix $\Sigma$ of $\tilde V$ is given by $[D^2G_{q-1}(\tilde x)]^{-1} - \beta_0^{-1}\mathrm{Id}_{q-1}$.

Remark 5.3. These rates of convergence remain still valid if we change our probability measure 
Pp,h,n to 

1 / 8 n q 

Pp,h,n{<y) = y exp — <Wj + Yl 2 5 °i> ihi 



X<i<j<n j=l i=l 



for $\beta \in \mathbb{R}_+$ and $h \in \mathbb{R}^q$. This measure and the characteristics of the corresponding function

\[ G_{\beta,h}(u) := \frac{\beta}{2}\langle u, u\rangle - \log\Bigl(\sum_{i=1}^{q}\exp(\beta u_i + h_i)\Bigr) \]

were studied in [26]. First of all we note that (1.1) is the same model with $h_1 = h$ and $h_i = 0$
for $i \in \{2, \ldots, q\}$. Based on the results of [13] and [26] and following the same procedures as
above, our results can easily be extended to the case that $h_i \ne 0$ for $i \in \{2, \ldots, q\}$. We omit
these extensions here.



6. Appendix 

For the proofs of Theorems 2.1, 2.2 and 2.3 and for Lemma 4.4 and Lemma 4.5 we need a
multivariate second-order Taylor expansion of $G_{\beta,h}$ defined in (1.5), for every $(\beta, h) \ne (\beta_0, h_0)$.
Let us denote by $D^2G_{\beta,h}(x)$ the Hessian matrix $\{\partial^2 G_{\beta,h}(x)/\partial x_i \partial x_j,\ i,j = 1, \ldots, q\}$ of $G_{\beta,h}$ at
$x$. We obtain

\[ G_{\beta,h}(u) = G_{\beta,h}(x) + \sum_{k=1}^{q}\frac{\partial}{\partial u_k}G_{\beta,h}(x)(u_k - x_k) + \frac{1}{2}\bigl\langle (u - x), D^2G_{\beta,h}(x)\cdot(u - x)\bigr\rangle \tag{6.1} \]
\[ \qquad + \frac{1}{6}\sum_{t,k,j=1}^{q}\tilde R_{t,k,j}(u_t - x_t)(u_k - x_k)(u_j - x_j) \]

with $|\tilde R_{t,k,j}| \le \bigl\|\frac{\partial^3}{\partial u_j \partial u_t \partial u_k}G_{\beta,h}\bigr\|$. For any fixed $m \in \{1, \ldots, q\}$ and any $x, u \in \mathbb{R}^q$ it follows that

\[ \frac{\partial}{\partial u_m}G_{\beta,h}(u) = \frac{\partial}{\partial u_m}G_{\beta,h}(x) + \frac{\partial^2}{\partial u_m^2}G_{\beta,h}(x)(u_m - x_m) + \sum_{k \ne m}\frac{\partial^2}{\partial u_k \partial u_m}G_{\beta,h}(x)(u_k - x_k) \tag{6.2} \]
\[ \qquad + \sum_{k=1}^{q} O\bigl((u_m - x_m)(u_k - x_k)\bigr) + \sum_{k,t \ne m} O\bigl((u_k - x_k)(u_t - x_t)\bigr). \]

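If $x_0$ is a global minimum point of $G_{\beta,h}$ we are able to calculate the Hessian as follows:

Lemma 6.1. The Hessian $D^2G_{\beta,h}(x_0)$ at an arbitrary global minimum point $x_0$ looks like:

\[ \frac{\partial^2}{\partial u_1^2}G_{\beta,h}(x_0) = \beta - \beta^2(q-1)x_{0,1}x_{0,q}, \qquad \frac{\partial^2}{\partial u_1 \partial u_q}G_{\beta,h}(x_0) = \beta^2 x_{0,1}x_{0,q}, \]

and

\[ \frac{\partial^2}{\partial u_q^2}G_{\beta,h}(x_0) = \beta - \beta^2\bigl(x_{0,1}x_{0,q} + (q-2)x_{0,q}^2\bigr), \qquad \frac{\partial^2}{\partial u_2 \partial u_q}G_{\beta,h}(x_0) = \beta^2 x_{0,q}^2. \]

Proof. According to Proposition 4.1 we know that for any minimizer $x_0$ of $G_{\beta,h}$ we have $x_{0,i} =
x_{0,k}$ for all $i, k \in \{2, \ldots, q\}$ and $x_{0,1} \ge x_{0,k}$ for all $k \in \{2, \ldots, q\}$ and $\sum_{i=1}^{q} x_{0,i} = 1$. Notice that
the equation $\nabla G_{\beta,h}(x_0) = 0$ implies

\[ x_{0,1} = \frac{\exp(\beta x_{0,1} + h)}{\exp(\beta x_{0,1} + h) + (q-1)\exp(\beta x_{0,q})}, \qquad x_{0,q} = \frac{\exp(\beta x_{0,q})}{\exp(\beta x_{0,1} + h) + (q-1)\exp(\beta x_{0,q})}. \]

Now we can calculate

\[ \frac{\partial^2}{\partial u_1^2}G_{\beta,h}(x_0) = \beta - \beta^2(q-1)\frac{\exp(\beta(x_{0,1} + x_{0,q}) + h)}{\bigl(\exp(\beta x_{0,1} + h) + (q-1)\exp(\beta x_{0,q})\bigr)^2} = \beta - \beta^2(q-1)x_{0,1}x_{0,q} \]

and

\[ \frac{\partial^2}{\partial u_1 \partial u_q}G_{\beta,h}(x_0) = \beta^2\frac{\exp(\beta(x_{0,1} + x_{0,q}) + h)}{\bigl(\exp(\beta x_{0,1} + h) + (q-1)\exp(\beta x_{0,q})\bigr)^2} = \beta^2 x_{0,1}x_{0,q}. \]

Moreover

\[ \frac{\partial^2}{\partial u_q^2}G_{\beta,h}(x_0) = \beta - \beta^2\frac{\exp(\beta x_{0,q})\bigl(\exp(\beta x_{0,1} + h) + (q-2)\exp(\beta x_{0,q})\bigr)}{\bigl(\exp(\beta x_{0,1} + h) + (q-1)\exp(\beta x_{0,q})\bigr)^2} = \beta - \beta^2\bigl(x_{0,1}x_{0,q} + (q-2)x_{0,q}^2\bigr) \]

and

\[ \frac{\partial^2}{\partial u_2 \partial u_q}G_{\beta,h}(x_0) = \beta^2\frac{\exp(2\beta x_{0,q})}{\bigl(\exp(\beta x_{0,1} + h) + (q-1)\exp(\beta x_{0,q})\bigr)^2} = \beta^2 x_{0,q}^2. \]

□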
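The four Hessian entries of Lemma 6.1 can be compared with numerical second derivatives. The sketch below assumes $G_{\beta,h}(u) = \frac{\beta}{2}\langle u,u\rangle - \log\sum_k \exp(\beta u_k + h\delta_{k,1})$ and locates a critical point by iterating the fixed-point equation $x = \mathrm{softmax}(\beta x + h e_1)$; the parameter values ($q = 4$, $\beta = 1$, $h = 0.5$) are hypothetical and chosen so that the iteration is a contraction and $G_{\beta,h}$ is strictly convex.

```python
import math

def G(u, beta, h):
    """Assumed form of G_{beta,h}; the external field acts on the first coordinate only."""
    quad = 0.5 * beta * sum(x * x for x in u)
    return quad - math.log(sum(math.exp(beta * x + (h if k == 0 else 0.0))
                               for k, x in enumerate(u)))

def critical_point(q, beta, h, steps=500):
    """Iterate x -> softmax(beta*x + h*e_1), starting from the uniform vector."""
    x = [1.0 / q] * q
    for _ in range(steps):
        w = [math.exp(beta * xi + (h if k == 0 else 0.0)) for k, xi in enumerate(x)]
        s = sum(w)
        x = [wi / s for wi in w]
    return x

def second_partial(u, beta, h, i, j, eps=1e-4):
    """Central finite-difference approximation of d^2 G / du_i du_j."""
    def shifted(di, dj):
        w = list(u)
        w[i] += di
        w[j] += dj
        return G(w, beta, h)
    return (shifted(eps, eps) - shifted(eps, -eps)
            - shifted(-eps, eps) + shifted(-eps, -eps)) / (4 * eps * eps)

q, beta, h = 4, 1.0, 0.5
x0 = critical_point(q, beta, h)
x1, xq = x0[0], x0[-1]

# Entries claimed in Lemma 6.1, keyed by 0-based index pairs.
lemma = {
    (0, 0): beta - beta ** 2 * (q - 1) * x1 * xq,
    (0, q - 1): beta ** 2 * x1 * xq,
    (q - 1, q - 1): beta - beta ** 2 * (x1 * xq + (q - 2) * xq ** 2),
    (1, q - 1): beta ** 2 * xq ** 2,
}
errors = {ij: abs(second_partial(x0, beta, h, *ij) - value)
          for ij, value in lemma.items()}
print(errors)
```

The discrepancies are at the level of the finite-difference error, consistent with the closed-form expressions in the lemma.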



With Lemma 6.1 we get that the Hessian of $G_{\beta,h}$ at a global minimum point is a matrix of
type (6.3). The following lemma is some linear algebra for a matrix of the form (6.3).
Lemma 6.2. For any $a, b, c, d \in \mathbb{R}$ consider the following matrix:

\[ A := \begin{pmatrix} a & b & b & \cdots & b \\ b & d & c & \cdots & c \\ b & c & d & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & c \\ b & c & \cdots & c & d \end{pmatrix} \in \mathbb{R}^{q \times q}. \tag{6.3} \]

Then $\det(A) = (d - c)^{q-2}\bigl(a(d + (q-2)c) - (q-1)b^2\bigr)$.

Proof. We apply Laplace's formula. □
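The determinant formula of Lemma 6.2 can be verified exactly for small $q$ with rational arithmetic. The sketch below builds the matrix (6.3) and compares a generic elimination-based determinant with the closed form $(d-c)^{q-2}\bigl(a(d+(q-2)c) - (q-1)b^2\bigr)$; the helper names and the sample values of $a, b, c, d$ are illustrative only.

```python
from fractions import Fraction

def det(matrix):
    """Determinant via Gaussian elimination, exact when entries are Fractions."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def matrix_63(q, a, b, c, d):
    """Matrix of the form (6.3): first row/column (a, b, ..., b), then d on and c off the diagonal."""
    rows = [[a] + [b] * (q - 1)]
    for i in range(1, q):
        rows.append([b] + [d if j == i else c for j in range(1, q)])
    return rows

def closed_form(q, a, b, c, d):
    return (d - c) ** (q - 2) * (a * (d + (q - 2) * c) - (q - 1) * b * b)

# Hypothetical sample entries.
a, b, c, d = Fraction(3), Fraction(2), Fraction(-1), Fraction(5, 7)
agreement = {q: det(matrix_63(q, a, b, c, d)) == closed_form(q, a, b, c, d)
             for q in range(2, 7)}
print(agreement)
```

For $q = 2$ the formula reduces to $ad - b^2$, which gives a quick mental check of the sign conventions.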

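Remark 6.3. At the extremity $(\beta_0, h_0) = \bigl(\frac{4(q-1)}{q}, \log(q-1) - \frac{2(q-2)}{q}\bigr)$ of the critical line,
$x = (1/2, 1/(2(q-1)), \ldots, 1/(2(q-1)))$ is the unique minimum point of $G_{\beta_0,h_0}$. Remark that

\[ \exp(\beta_0 x_1 + h_0) = (q-1)\exp(2/q), \qquad \exp(\beta_0 x_q) = \exp(2/q). \]

With Lemma 6.1 we obtain

\[ \frac{\partial^2 G_{\beta_0,h_0}}{\partial x_1^2}(x) = \frac{4(q-1)}{q^2} \quad\text{and}\quad \frac{\partial^2 G_{\beta_0,h_0}}{\partial x_q^2}(x) = \frac{4(q^2 - 3q + 3)}{q^2} \]

and

\[ \frac{\partial^2 G_{\beta_0,h_0}}{\partial x_1 \partial x_q}(x) = \frac{4(q-1)}{q^2}, \qquad \frac{\partial^2 G_{\beta_0,h_0}}{\partial x_2 \partial x_q}(x) = \frac{4}{q^2}. \]

Thus $a = b$ in (6.3).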
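The values in Remark 6.3 can be checked numerically. The sketch below assumes the extremity $(\beta_0, h_0) = \bigl(4(q-1)/q,\ \log(q-1) - 2(q-2)/q\bigr)$ and uses the Hessian $\beta\,\mathrm{Id} - \beta^2(\mathrm{diag}(x) - xx^t)$, which is the form the Hessian of $G_{\beta,h}$ takes at a critical point $x$ (this matches the entries of Lemma 6.1). Function names are illustrative.

```python
import math

def extremity(q):
    """Assumed extremity of the critical line and the candidate minimum point."""
    beta0 = 4.0 * (q - 1) / q
    h0 = math.log(q - 1) - 2.0 * (q - 2) / q
    x = [0.5] + [1.0 / (2 * (q - 1))] * (q - 1)
    return beta0, h0, x

def softmax_point(x, beta, h):
    """softmax(beta*x + h*e_1); equals x exactly when the gradient of G vanishes at x."""
    w = [math.exp(beta * xi + (h if k == 0 else 0.0)) for k, xi in enumerate(x)]
    s = sum(w)
    return [wi / s for wi in w]

def hessian_at_critical_point(x, beta):
    """beta*Id - beta^2 (diag(x) - x x^T), valid where x = softmax(beta*x + h*e_1)."""
    q = len(x)
    return [[(beta if i == j else 0.0)
             - beta ** 2 * ((x[i] if i == j else 0.0) - x[i] * x[j])
             for j in range(q)] for i in range(q)]

max_error = 0.0
for q in range(3, 8):
    beta0, h0, x = extremity(q)
    p = softmax_point(x, beta0, h0)
    H = hessian_at_critical_point(x, beta0)
    deviations = [
        max(abs(pi - xi) for pi, xi in zip(p, x)),          # gradient vanishes
        abs(H[0][0] - 4.0 * (q - 1) / q ** 2),              # d^2/dx_1^2
        abs(H[0][q - 1] - 4.0 * (q - 1) / q ** 2),          # d^2/dx_1 dx_q
        abs(H[q - 1][q - 1] - 4.0 * (q * q - 3 * q + 3) / q ** 2),  # d^2/dx_q^2
        abs(H[1][q - 1] - 4.0 / q ** 2),                    # d^2/dx_2 dx_q
    ]
    max_error = max(max_error, max(deviations))
print(max_error)
```

The first two Hessian entries coincide, which is the statement $a = b$; by Lemma 6.2 the determinant then carries the factor $a\bigl(d + (q-2)c - (q-1)a\bigr)$, and with the values above this factor is zero, in line with the degeneracy at $(\beta_0, h_0)$.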



For the proofs of the results at criticality, Theorems 2.4, 2.6 and Lemma 4.6, we need a
multivariate fourth-order Taylor expansion of $G_{\beta_0,h_0}$ (defined in (1.5)). We fix the notation
$G := G_{\beta_0,h_0}$ for the following calculations. We know that $x = (1/2, 1/(2(q-1)), \ldots, 1/(2(q-1)))$
is the unique minimum point of $G$. Let $u = (1 - q, 1, \ldots, 1) \in \mathcal{M} \subset \mathbb{R}^q$, $v \in \mathcal{M} \cap u^{\perp}$ and $t \in \mathbb{R}$.
For any $p \in \mathbb{N}$ and $z \in \mathbb{R}^q$ let us fix the notation

\[ R_{i_1, \ldots, i_p}(z) := \Bigl(\frac{\partial^p G}{\partial x_{i_1} \cdots \partial x_{i_p}}\Bigr)(z). \]

A second-order Taylor expansion yields

\[ G(x + tu + v) = G(x + tu) + \frac{1}{2}\bigl\langle v, (D^2G)(x + tu)\,v\bigr\rangle + \frac{1}{6}\sum_{j,k,l=1}^{q} R_{j,k,l}(x + tu + \gamma v)\,v_jv_kv_l \]

for some $\gamma \in (0, 1)$, since $\langle (\nabla G)(x + tu), v\rangle = 0$: the last $q - 1$ coordinates of $x + tu$ are equal
and with Lemma 4.2 the last $q - 1$ coordinates of the gradient $(\nabla G)(x + tu)$ are equal, and
hence it is orthogonal to $v$. A fourth-order Taylor expansion for $G(x + tu)$ yields

\[ G(x + tu) = G(x) + \frac{1}{24}\sum_{j,k,l,m=1}^{q} R_{j,k,l,m}(x + \gamma tu)\,t^4 u_ju_ku_lu_m \tag{6.4} \]

for some $\gamma \in (0, 1)$. To see (6.4) notice that the first-order term is zero since $x$ is a global
minimizer of $G$. The second-order term is zero, since we know from Lemma 4.4 that $D^2G_{\beta,h}(x)$
is positive definite if and only if $(\beta, h) \ne (\beta_0, h_0)$. Hence the third-order term is zero yielding
the identity (6.4). Summarizing we obtain:

\[ G(x + tu + v) = G(x) + \frac{1}{2}\bigl\langle v, (D^2G)(x + tu)\,v\bigr\rangle + \frac{1}{6}\sum_{j,k,l=1}^{q} R_{j,k,l}(x + tu + \gamma v)\,v_jv_kv_l \]
\[ \qquad + \frac{1}{24}\sum_{j,k,l,m=1}^{q} R_{j,k,l,m}(x + \gamma tu)\,t^4 u_ju_ku_lu_m. \tag{6.5} \]

With $y := x + tu + v$ we will calculate $\frac{\partial}{\partial y_i}G(y)$ for $i \in \{1, \ldots, q\}$ using (6.5). The derivative of
the first summand in (6.5) is zero since $x$ is the global minimizer of $G$. With

\[ \frac{1}{2}\bigl\langle v, (D^2G)(x + tu)\,v\bigr\rangle = \frac{1}{2}R_{i,i}(x + tu)(y_i - x_i - tu_i)^2 + \sum_{k \ne i} R_{i,k}(x + tu)(y_k - x_k - tu_k)(y_i - x_i - tu_i) \]
\[ \qquad + \frac{1}{2}\sum_{j,k \ne i} R_{j,k}(x + tu)(y_k - x_k - tu_k)(y_j - x_j - tu_j) \]

we obtain

\[ A_1(i) := \frac{\partial}{\partial y_i}\Bigl(\frac{1}{2}\bigl\langle v, (D^2G)(x + tu)\,v\bigr\rangle\Bigr) = R_{i,i}(x + tu)\,v_i + \sum_{k \ne i} R_{i,k}(x + tu)\,v_k. \]

With Lemma 4.2 we obtain $R_{1,2} = R_{1,3} = \cdots = R_{1,q}$ and since $v \in u^{\perp}$ we have $A_1(1) = 0$.
With a Taylor expansion for $R_{i,k}(x+tu)$ we get for $t$ small

\[ A_1(i) = \langle R_{i,\cdot}(x), v\rangle + O_q\Big(\sum_{l=1}^{q} v_l\, t\Big). \]

Here $O_q$ indicates that the implied constant depends on $q$. This is because all $R_{i_1,\ldots,i_p}(x)$
depend only on $q$, since $x$ depends only on $q$. The second partial derivatives of $G$ were listed
in Remark 6.3, hence we end up with

\[ A_1(1) = 0, \qquad A_1(i) = \frac{4(q^2-3q+3)}{q^2}\, v_i + \frac{4}{q^2}\sum_{k=2,\,k\neq i}^{q} v_k + O_q\Big(\sum_{l=1}^{q} v_l\, t\Big), \quad i \geq 2, \tag{6.6} \]

for small $t$. The last formula can even be simplified since $\sum_{k=2}^{q} v_k = 0$, using $v \in u^{\perp}$. For
reasons of application we will not use this simplification. The partial derivative $\frac{\partial}{\partial y_i}$ of the third
term in (6.5) is

\[ A_2(i) := \frac{1}{2} R_{i,i,i}(x+tu+\gamma v)\, v_i^2 + \sum_{k\neq i}^{q} R_{i,i,k}(x+tu+\gamma v)\, v_i v_k + \frac{1}{2}\sum_{j,k\neq i}^{q} R_{i,j,k}(x+tu+\gamma v)\, v_k v_j . \]
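The coefficients $\frac12$, $1$, $\frac12$ in $A_2(i)$ come from differentiating a symmetric cubic form. The sketch below treats the coefficient tensor as constant, as the displayed formula does; the symbol $T$ is introduced here only as shorthand for the third derivatives at the fixed intermediate point.

```latex
% T_{j,k,l} stands for R_{j,k,l} evaluated at a fixed point; T is symmetric in its indices.
\begin{align*}
\frac{\partial}{\partial v_i}\,\frac16\sum_{j,k,l=1}^{q} T_{j,k,l}\,v_j v_k v_l
 &= \frac12\sum_{k,l=1}^{q} T_{i,k,l}\,v_k v_l \\
 &= \frac12\,T_{i,i,i}\,v_i^2 \;+\; \sum_{k\neq i} T_{i,i,k}\,v_i v_k
    \;+\; \frac12\sum_{j,k\neq i} T_{i,j,k}\,v_j v_k .
\end{align*}
% The double sum is split according to whether both, exactly one (two ways),
% or none of the indices k, l equals i.
```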

Using Taylor for $R_{i,j,k}(x+tu+\gamma v)$ we obtain for small $t$ and small $v$

\[ A_2(i) := A_2(i,v,t) := \frac{1}{2} R_{i,i,i}(x)\, v_i^2 + \sum_{k\neq i}^{q} R_{i,i,k}(x)\, v_i v_k + \frac{1}{2}\sum_{j,k\neq i}^{q} R_{i,j,k}(x)\, v_k v_j + O_q(f(v,t)) \tag{6.7} \]

with

\[ O_q(f(v,t)) = O_q\Big( \Big(t + \sum_{k=2}^{q} c_k(q)\, v_k\Big)\Big(v_i^2 + \sum_{k\neq i} v_i v_k + \sum_{k,j\neq i} v_k v_j\Big) \Big). \]

Here $c_k(q)$ denotes a constant depending on $q$; it is written out only to make clear that the relation
$\sum_{k=2}^{q} v_k = 0$ cannot be applied. In our application the order (in $n$) of the $v_k$'s will not depend on $k$, and
$v_2$ will be smaller in order than $t$. In this situation we have $O_q(f(v,t)) = O_q(t v_2^2)$. We will
calculate $A_2(1)$ with the help of the third derivatives, which are $R_{1,1,1}(x) = R_{1,1,k}(x) = 0$ and

\[ R_{1,j,j}(x) = \frac{16(q-1)(q-2)}{q^3}, \qquad R_{1,j,k}(x) = -\frac{16(q-1)}{q^3} \quad (j \neq k). \]

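Therefore the first two summands in (6.7) are zero and the third term is, using $\sum_{k=2}^{q} v_k = 0$,

\begin{align*}
\frac{1}{2}\sum_{j,k=2}^{q} R_{1,j,k}(x)\, v_j v_k
 &= \frac{1}{2}\sum_{j=2}^{q} \frac{16(q-1)(q-2)}{q^3}\, v_j^2 - \frac{1}{2}\sum_{j,k=2,\,j\neq k}^{q} \frac{16(q-1)}{q^3}\, v_j v_k \\
 &= \frac{8(q-1)(q-2)}{q^3}\sum_{j=2}^{q} v_j^2 - \frac{8(q-1)}{q^3}\sum_{j=2}^{q} v_j \sum_{k=2,\,k\neq j}^{q} v_k \\
 &= \frac{8(q-1)^2}{q^3}\sum_{j=2}^{q} v_j^2 . \tag{6.8}
\end{align*}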
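As a sanity check of the third term, here is a worked instance for $q = 3$. It is a sketch that assumes the third derivatives $R_{1,j,j}(x) = 16(q-1)(q-2)/q^3$ and $R_{1,j,k}(x) = -16(q-1)/q^3$ for $j \neq k$ (the sign of the latter is a reconstruction), and compares the result with the closed form $\frac{8(q-1)^2}{q^3}\sum_{j=2}^{q} v_j^2$. The vector $v = (0, w, -w)$ is a hypothetical choice made only for illustration.

```latex
% q = 3, v = (0, w, -w): satisfies v_1 = 0 and v_2 + v_3 = 0.
% R_{1,2,2}(x) = R_{1,3,3}(x) = 16*2*1/27 = 32/27,   R_{1,2,3}(x) = -16*2/27 = -32/27.
\begin{align*}
\frac12\sum_{j,k=2}^{3} R_{1,j,k}(x)\,v_j v_k
 &= \frac12\Big( \tfrac{32}{27}\,w^2 + \tfrac{32}{27}\,w^2
      + 2\cdot\big(-\tfrac{32}{27}\big)\cdot w\cdot(-w) \Big)
  = \frac{64}{27}\,w^2, \\
\frac{8(q-1)^2}{q^3}\sum_{j=2}^{3} v_j^2
 &= \frac{32}{27}\cdot 2w^2 = \frac{64}{27}\,w^2 .
\end{align*}
```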



Finally, the partial derivative of the fourth term in (6.5) is

\[ A_3(i) := A_3(i,t) := \frac{1}{6} R_{i,i,i,i}(x)\, t^3 u_i^3 + \frac{1}{2}\sum_{k\neq i} R_{i,i,i,k}(x)\, t^3 u_i^2 u_k + \frac{1}{2}\sum_{j,k\neq i} R_{i,i,j,k}(x)\, t^3 u_i u_k u_j
   + \frac{1}{6}\sum_{j,k,l\neq i} R_{i,j,k,l}(x)\, t^3 u_j u_k u_l + O_q(t^4), \tag{6.9} \]



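where we applied a Taylor expansion for $R_{i,j,k,l}(x+\gamma t u)$. Again we calculate
$A_3(1)$, using the fourth derivatives (distinct letters $j,k,l$ denote distinct indices in $\{2,\ldots,q\}$)

\[ R_{1,1,1,1}(x) = \frac{32(q-1)^4}{q^4}, \qquad R_{1,1,1,k}(x) = -\frac{32(q-1)^3}{q^4}, \qquad R_{1,1,j,k}(x) = \frac{32(q-1)^2}{q^4} \]

and

\[ R_{1,k,k,k}(x) = \frac{32(q-1)(2q^2-10q+11)}{q^4}, \qquad R_{1,j,j,k}(x) = -\frac{32(q-1)(2q-5)}{q^4} \]

and $R_{1,j,k,l}(x) = \frac{96(q-1)}{q^4}$. We obtain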



\[ \frac{1}{6} R_{1,1,1,1}(x)\, t^3 (1-q)^3 = -\frac{16(q-1)^7 t^3}{3q^4}, \qquad \frac{1}{2}\sum_{k=2}^{q} R_{1,1,1,k}(x)\, t^3 (1-q)^2 = -\frac{16(q-1)^6 t^3}{q^4} \]

and

\[ \frac{1}{2}\sum_{j,k=2}^{q} R_{1,1,j,k}(x)\, t^3 (1-q) = -\frac{16(q-1)^5 t^3}{q^4} . \]

Moreover we have

\[ \frac{1}{6}\sum_{j,k,l=2}^{q} R_{1,j,k,l}(x)\, t^3 = \frac{16(q-1)^2(2q^2-10q+11)\,t^3}{3q^4} - \frac{16(q-1)^2(q-2)(2q-5)\,t^3}{q^4}
   + \frac{16(q-1)^2(q-2)(q-3)\,t^3}{q^4} . \]

Hence

\[ A_3(1) = -\frac{16(q-1)^4}{3q}\, t^3 + O_q(t^4). \tag{6.10} \]
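The step from the four contributions to (6.10) is a polynomial identity. The sketch below spells it out; the shorthand $m := q-1$ is introduced here only for the computation, and the signs of the individual contributions (the three leading terms negative, the middle term of the triple sum negative) are assumed as reconstructed from the fourth derivatives.

```latex
% Notation used only here: m := q - 1, so q = m + 1, and
%   2q^2 - 10q + 11 = 2m^2 - 6m + 3,  (q-2)(2q-5) = (m-1)(2m-3),  (q-2)(q-3) = (m-1)(m-2).
% The four contributions to A_3(1), each divided by 16 (q-1)^2 t^3 / q^4:
\begin{align*}
 &-\tfrac13 m^5 \;-\; m^4 \;-\; m^3
   \;+\; \tfrac13\,(2m^2-6m+3) \;-\; (m-1)(2m-3) \;+\; (m-1)(m-2) \\
 &\qquad= -\tfrac13 m^5 - m^4 - m^3 - \tfrac13 m^2
  \;=\; -\tfrac13\,m^2 (m+1)^3
  \;=\; -\tfrac13\,(q-1)^2 q^3 .
\end{align*}
% Multiplying back by 16 (q-1)^2 t^3 / q^4:
\[ A_3(1) = -\frac{16(q-1)^4}{3q}\,t^3 + O_q(t^4). \]
```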
We summarize that the first partial derivative of $G(y)$ in (6.5) satisfies

\[ \frac{\partial}{\partial y_1} G(x+tu+v) = \frac{8(q-1)^2}{q^3}\sum_{j=2}^{q} v_j^2 - \frac{16(q-1)^4}{3q}\, t^3 + O_q(f(v,t)) + O_q(t^4), \tag{6.11} \]

using the notation of (6.7). The $i$th partial derivative for $i \in \{2,\ldots,q\}$ is given by

\[ \frac{\partial}{\partial y_i} G(x+tu+v) = A_1(i) + A_2(i,v,t) + A_3(i,t) \tag{6.12} \]

with $A_j(i)$ defined in (6.6), (6.7) and (6.9).



Proof of Lemma 4.4. For the proof we will use the following alternative parametrization of the
minimum points of $G_{\beta,h}$, given by permutations of

\[ x_s := \Big( \frac{1+(q-1)s}{q}, \frac{1-s}{q}, \ldots, \frac{1-s}{q} \Big). \]

It is important to notice, see for example [3], that $s(\beta,h)$ is positive, well-defined and strictly
increasing in $\beta$ on an open interval containing $[\beta_c,\infty)$ and that $s(\beta_c,0) = (q-2)/(q-1)$ is a
global minimum. The Lemma follows once we prove that $\beta q x_{s,1}x_{s,q} - 1 < 0$ and $\beta x_{s,q} - 1 < 0$.
The second inequality follows directly from Proposition 4.1. To prove $\beta q x_{s,1}x_{s,q} - 1 < 0$, we
first consider the case $h = h_0$. First of all we note that the minima are the solutions of

\[ f(s(\beta,h_0)) = \log(1+(q-1)s(\beta,h_0)) - \log(1-s(\beta,h_0)) - \beta s(\beta,h_0) - h_0 = 0 . \]

If $\beta < \beta_0$ we have that $df(s(\beta,h_0))/ds > 0$. Rearranging this inequality and using the
parametrization of $x_s$ yields the result. If $\beta > \beta_0$ we use $\nabla G_{\beta,h_0}(x_s) = 0$ to obtain

\[ \log(x_{s,1}) - \beta x_{s,1} - h_0 = \log(x_{s,q}) - \beta x_{s,q} . \]

This equation yields

\[ \beta q x_{s,1}x_{s,q} - 1 = \Big( \frac{\log(x_{s,1}) - \log(x_{s,q}) - h_0}{x_{s,1} - x_{s,q}} \Big)\, q x_{s,1}x_{s,q} - 1
   = \Big( \log\Big(\frac{x_{s,1}}{x_{s,q}}\Big) - h_0 \Big)\, q\, \frac{x_{s,1}x_{s,q}}{x_{s,1} - x_{s,q}} - 1 . \]

Using the fact that $x_{s,1} + (q-1)x_{s,q} = 1$ we obtain

\begin{align*}
\beta q x_{s,1}x_{s,q} - 1
 &= \Big( \log\Big(\frac{(q-1)x_{s,1}}{1-x_{s,1}}\Big) - h_0 \Big)\, q\, \frac{x_{s,1}(1-x_{s,1})}{(q-1)x_{s,1} - (1-x_{s,1})} - 1 \\
 &= \Big( \log\Big(\frac{(q-1)x_{s,1}}{1-x_{s,1}}\Big) - h_0 \Big)\, \frac{q x_{s,1}(1-x_{s,1})}{q x_{s,1} - 1} - 1
  = \frac{q x_{s,1}(1-x_{s,1})}{q x_{s,1} - 1}\, f_{h_0}(x_{s,1}),
\end{align*}

where

\[ f_h(x_{s,1}) := \log\Big(\frac{(q-1)x_{s,1}}{1-x_{s,1}}\Big) - h - \frac{q x_{s,1} - 1}{q x_{s,1}(1-x_{s,1})} . \]

For $\beta_0$ the global minimizer is given by $y = (1/2, 1/(2(q-1)), \ldots, 1/(2(q-1)))$ and it follows easily
that $f_{h_0}(1/2) = 0$. Additionally $\frac{d f_{h_0}}{dx}(x) < 0$ for $x \in (2^{-1}, 1)$. Thus we obtain that $f_{h_0}(x_{s,1}) < 0$
for $x_{s,1} \in (2^{-1}, 1)$, and since $q x_{s,1} - 1 > 0$ there, this is equivalent to $\beta q x_{s,1}x_{s,q} - 1 < 0$.
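The two facts about $f_{h_0}$ used above can be verified directly. The sketch below assumes the value $h_0 = \log(q-1) - 2(q-2)/q$, which is not restated in this section, and the definition $f_h(x) = \log\big(\tfrac{(q-1)x}{1-x}\big) - h - \tfrac{qx-1}{qx(1-x)}$.

```latex
% Assumption: h_0 = \log(q-1) - 2(q-2)/q.
\[ f_{h_0}\big(\tfrac12\big) = \log(q-1) - h_0 - \frac{q/2 - 1}{q/4}
   = \log(q-1) - h_0 - \frac{2(q-2)}{q} = 0 . \]
% Derivative of f_h (independent of h):
\begin{align*}
\frac{d f_h}{dx}(x)
 &= \frac{1}{x(1-x)} - \frac{q x^2 - 2x + 1}{q\,x^2 (1-x)^2}
  = \frac{-2q x^2 + (q+2)x - 1}{q\,x^2 (1-x)^2}
  = -\frac{(2x-1)(qx-1)}{q\,x^2 (1-x)^2} .
\end{align*}
% For q > 2 this is negative on (1/2, 1) and vanishes at x = 1/2.
```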

Now we consider the case $h \neq h_0$. If $\beta < \beta_0$, the proof is identical to the case of $\beta < \beta_0$ for
$h = h_0$. If $\beta > \beta_0$ we have

\[ \beta q x_{s,1}x_{s,q} - 1 = \frac{q x_{s,1}(1-x_{s,1})}{q x_{s,1} - 1}\, f_h(x_{s,1}) . \]

For $\beta > \beta_0$ we know that $1 > s(\beta,h) > s(\beta_0,h) > s(\beta_c,0) = (q-2)/(q-1)$. The rest of the
proof is identical to case (ii) of the proof of Proposition 2.2 in [13]. □